ABSTRACT


The first part of this thesis aims to identify and analyze what aspects of the
MIL-HDBK-217 prediction model are causing the large variation between predicted and
field reliability. The key findings of the literature research suggest that the main reason
for the inaccuracy in prediction is that the constant failure rate assumption used in
MIL-HDBK-217 is usually not applicable. Secondly, even if the constant failure rate
assumption is applicable, the disparity may still exist in the presence of design and
quality related problems in new systems. A possible solution is to apply reliability growth
testing (RGT) to new systems during the development phase in an attempt to remove
these design deficiencies so that the system’s reliability will grow and approach the
predicted value. In view of the importance of RGT in minimizing the disparity, this thesis
provides a detailed application of the AMSAA Extended Reliability Growth Models to
the reliability growth analysis of a combat system. It shows how program managers can
analyze test data using commercial software to estimate the system’s demonstrated
reliability and the increase in reliability due to delayed fixes.
EXECUTIVE  SUMMARY


One  of  the  major  problems  in  today’s  military  systems  is  the  issue  of  poor 
reliability,  and  the  inconsistency  between  predicted  and  field  reliability.  Experience  has 
shown  that  two  reasons  are:  1)  the  inability  to  consistently  predict  field  reliability  using 
reliability  prediction  models,  and  2)  inadequate  emphasis  on  reliability  testing  prior  to 
system fielding, as more emphasis is being placed on meeting performance requirements
than  reliability  requirements.  The  first  part  of  this  thesis  aims  to  identify  and  analyze 
what  principal  aspects  of  the  MIL-HDBK-217  prediction  model  are  causing  the  large 
variation  between  prediction  and  field  reliability  with  the  ultimate  goal  of  minimizing  the 
gap. The second part of the thesis demonstrates how the Duane reliability growth model
can be used for reliability growth planning and how the AMSAA Extended Reliability
Growth Models can be applied to analyze reliability growth.

The  key  findings  of  the  literature  research  suggest  that  the  main  issues  are  some 
of  the  inherent  assumptions  of  the  MIL-HDBK-217  prediction  model.  First,  the  constant 
failure  rate  assumption  that  has  been  generally  applied  in  reliability  prediction  is  usually 
not  applicable.  However,  Drenick’s  theorem  has  proven  that  complex  repairable  systems, 
under  certain  constraints,  can  be  well  represented  by  the  exponential  distribution.  The 
reliability  engineer  must  be  able  to  recognize  when  the  mathematical  simplicity  of  the 
constant  failure  rate  model  can  be  used  without  a  substantial  penalty  in  prediction 
accuracy. Secondly, accurate failure rate data are often lacking because acquiring field
data on components is very time consuming. A well designed part is less likely to fail
early, leading to an extended waiting time for any useful information. A possible solution
is to apply accelerated life testing to components to shorten the waiting time required for
acquiring failure rate data. Lastly, even if the
exponential  distribution  is  applicable,  the  disparity  between  predicted  and  field  reliability 
may  still  exist  in  new  systems  because  of  unexpected  failure  modes  that  may  arise  in  the 
presence  of  design  and  quality  deficiencies  which  will  prevent  the  system  from  reaching 
the  predicted  value.  A  possible  solution  is  to  apply  reliability  growth  testing  (RGT)  to 
new systems during the development phase in an attempt to remove these design
deficiencies  so  that  the  system’s  reliability  will  grow  and  approach  the  predicted  value.  In 
contrast  to  the  MIL-HDBK-217  prediction  model,  AMSAA  reliability  growth  models 
assume  that  system  failures  during  development  follow  a  Non-Homogeneous  Poisson 
Process  (NHPP). 

In  view  of  the  importance  of  RGT  in  minimizing  the  disparity,  this  thesis  provides 
a  detailed  application  of  the  AMSAA  Extended  Reliability  Growth  Models  to  the 
reliability  growth  analysis  of  a  combat  system.  It  shows  how  program  managers  can 
analyze  test  data  using  commercial  software  to  estimate  the  system’s  demonstrated 
reliability and the increase in reliability due to delayed fixes. The example combat
system  consists  of  two  main  subsystems.  The  reliability  growth  for  both  the  subsystems  is 
tracked  over  three  phases  of  testing.  Reliability  is  tracked  on  a  phase  by  phase  basis  using 
test data collected within each test phase. The type of reliability growth model selected
is based on the type of management approach employed in each test phase. The three
types of AMSAA reliability growth models are: 1) AMSAA Extended Model for
Test-Fix-Test, 2) AMSAA Extended Test-Find-Test Projection Model, and 3) AMSAA
Extended Model for Test-Fix-Find-Test.

The  results  of  the  reliability  analysis  for  the  combat  system  show  that  the 
demonstrated  system  reliability  for  both  subsystems  is  initially  low  but  improves  as 
testing progresses. Reliability is finally estimated to meet the predicted value as failure
modes are discovered and eliminated through the Test-Analyze-And-Fix (TAAF)
process. I conclude that the
application  of  RGT  during  the  developmental  phase  is  effective  in  minimizing  the 
disparity  between  predicted  and  field  reliability.  Systems  that  bypass  development  testing 
will  experience  low  reliability  in  the  field,  which  is  one  of  the  main  causes  of  disparity 
between  predicted  and  field  reliability. 

There  are  also  some  important  lessons  learned  on  the  use  of  the  reliability  growth 
models from this thesis. For the Duane model, the total test time required for an RGT
program is sensitive to the system’s initial reliability, initial test time, and growth rate. In
most practical cases, the total test time is fixed by the time and resources available in the
development program.
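
The sensitivity described above can be illustrated with a short sketch. It assumes the common planning form of the Duane postulate, which is not quoted in this summary: cumulative MTBF M_c(t) = M_I (t / t_I)^alpha and instantaneous MTBF M_c(t) / (1 - alpha). Solving for the time at which the instantaneous MTBF reaches a target M_F gives T = t_I [(1 - alpha) M_F / M_I]^(1/alpha). The function name and all numerical values are hypothetical.

```python
def duane_total_test_time(m_initial, t_initial, growth_rate, m_target):
    """Total test time T at which the instantaneous MTBF reaches m_target.

    Assumed Duane planning form (an illustration, not taken from the thesis):
        cumulative MTBF     M_c(t) = m_initial * (t / t_initial) ** growth_rate
        instantaneous MTBF  M_i(t) = M_c(t) / (1 - growth_rate)
    """
    if not 0.0 < growth_rate < 1.0:
        raise ValueError("growth_rate must lie strictly between 0 and 1")
    if m_initial <= 0 or t_initial <= 0 or m_target <= 0:
        raise ValueError("MTBF and time inputs must be positive")
    ratio = (1.0 - growth_rate) * m_target / m_initial
    return t_initial * ratio ** (1.0 / growth_rate)


if __name__ == "__main__":
    # Hypothetical planning values, chosen only to show the sensitivity.
    for alpha in (0.3, 0.4, 0.5):
        hours = duane_total_test_time(m_initial=50.0, t_initial=100.0,
                                      growth_rate=alpha, m_target=200.0)
        print(f"growth rate {alpha:.1f}: about {hours:,.0f} test hours")
```

Under these assumed inputs, a modest change in growth rate changes the required test time severalfold, which is why a fixed test budget constrains the achievable target.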

The  use  of  failure  mode  designation  in  AMSAA  Extended  Reliability  Growth 
Models  has  proven  to  be  beneficial  as  it  can  provide  many  useful  metrics  in  reliability 
growth  analysis  and  for  decision  making  during  the  test  program.  They  are:  1)  initial 
system  reliability  at  the  beginning  of  a  test  phase,  2)  the  average  effectiveness  factor  (EF) 
of  remedying  failure  modes,  3)  fraction  of  seen  and  unseen  Type  BD  failure  modes,  and 
4)  system  failure  rate  breakdown  for  individual  failure  modes.  Knowing  the  failure  rate 
breakdown  of  individual  failure  modes  in  the  system  is  important  as  it  enables  easy 
identification  of  failure  modes  with  relatively  high  failure  rate.  It  is  also  important  to  note 
that  the  final  system  reliability  is  sensitive  to  the  assigned  value  of  EF  for  Type  BD 
failure modes. To prevent overestimation of the system’s final reliability, a conservative
EF  should  be  assigned  since  the  actual  effectiveness  of  the  delayed  fixes  cannot  be 
determined  without  further  testing. 
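
To illustrate why a conservative EF matters, the sketch below evaluates a simplified projection of the kind used in AMSAA-style projection models. The formula, the function name, and all numbers are assumptions for illustration and are not taken from this thesis: failure modes that receive no fix keep their full rate, each seen Type BD mode keeps a fraction (1 - EF) of its rate, and the rate at which new BD modes are still being discovered is weighted by EF.

```python
def projected_failure_rate(rate_unfixed, bd_mode_rates, ef, unseen_bd_rate=0.0):
    """Simplified projected failure rate after delayed fixes (assumed form).

    rate_unfixed   -- combined rate of modes that receive no delayed fix
    bd_mode_rates  -- rates of the seen Type BD modes before the fixes
    ef             -- single average effectiveness factor, between 0 and 1
    unseen_bd_rate -- rate at which new BD modes are still appearing
    """
    if not 0.0 <= ef <= 1.0:
        raise ValueError("ef must lie between 0 and 1")
    remaining_seen = sum((1.0 - ef) * rate for rate in bd_mode_rates)
    return rate_unfixed + remaining_seen + ef * unseen_bd_rate


if __name__ == "__main__":
    bd_rates = [0.004, 0.003, 0.001]  # hypothetical failures per hour
    for ef in (0.5, 0.7, 0.9):
        lam = projected_failure_rate(0.002, bd_rates, ef, unseen_bd_rate=0.001)
        print(f"EF = {ef:.1f}: projected MTBF of roughly {1.0 / lam:.0f} hours")
```

With these hypothetical inputs the projected MTBF rises steeply as EF approaches 1, so an optimistic EF can overstate the final reliability considerably.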

For  new  systems  under  development,  the  use  of  the  AMSAA  NHPP  model 
provides  a  better  representation  of  the  system’s  failure  rate  than  the  exponential 
distribution because the failure rate varies with time as testing progresses. Once the
system  matures  through  a  period  of  testing  and  reliability  growth  has  reached  a  plateau, 
the  system’s  failure  rate  will  tend  towards  being  well  represented  by  an  exponential 
distribution. 
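
The relationship between the two models can be sketched numerically. The example assumes the usual power-law form of the NHPP failure intensity, rho(t) = lambda * beta * t^(beta - 1), which this summary does not state explicitly; the parameter values are hypothetical. With beta below 1 the intensity falls as testing progresses (reliability growth), and with beta equal to 1 it reduces to the constant rate of the exponential case.

```python
def power_law_intensity(t, lam, beta):
    """Failure intensity rho(t) = lam * beta * t**(beta - 1) of a power-law NHPP."""
    if t <= 0 or lam <= 0 or beta <= 0:
        raise ValueError("t, lam and beta must all be positive")
    return lam * beta * t ** (beta - 1.0)


if __name__ == "__main__":
    lam = 0.5  # hypothetical scale parameter
    for beta in (0.6, 1.0):
        rates = [power_law_intensity(t, lam, beta) for t in (10.0, 100.0, 1000.0)]
        print(f"beta = {beta}: intensities at 10, 100, 1000 h = "
              f"{[round(r, 4) for r in rates]}")
```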
I.  INTRODUCTION


A.  BACKGROUND 


Reliable  weapon  systems  are  critical  elements  for  fighting  and  winning  wars,  and 
reliability  is  an  effective  force  multiplier  that  contributes  towards  higher  operational 
readiness  and  a  reduced  logistics  footprint.  One  of  the  major  problems  in  today’s  military 
systems  is  the  issue  of  poor  reliability,  and  the  inconsistency  between  predicted  and  field 
reliability.  Experience  has  shown  that  two  reasons  for  this  inconsistency  are:  1)  the 
inability  to  consistently  predict  field  reliability  using  reliability  prediction  models,  and  2) 
inadequate  emphasis  on  reliability  testing  prior  to  system  fielding  as  more  emphasis  is 
being  placed  on  meeting  performance  requirements  than  reliability  requirements  [Ref.  1 
and  Ref.  2]. 

This chapter first introduces the issues concerning the inability to predict field
reliability and the importance of reliability testing for military systems, and then
introduces the concept of reliability growth. The scope and objectives of this research
are then presented along with the potential benefits.

Within  the  military,  there  is  a  need  in  the  early  stages  of  the  development  program 
to  accurately  predict  the  expected  field  reliability  of  military  systems  for  logistics  and 
operational  planning  purposes.  These  include  the  determination  of  spares  quantity, 
forecast  of  maintenance  support  cost,  life  cycle  cost,  and  systems  availability  analysis. 
These  analyses  require  accurate  reliability  predictions.  Research  has  shown,  however, 
that  the  field  reliability  of  weapon  systems  has  often  failed  to  measure  up  to  its  predicted 
Mean-Time-Between-Failure (MTBF) [Ref. 1].

Empirically  it  has  been  found  that  the  ratio  of  the  predicted  MTBF  to  its  field 
MTBF  for  military  systems  can  vary  by  as  much  as  20:1  [Ref.  1].  Table  1  presents  some 
examples  of  this  disparity. 




Equipment             Reliability Ratio (Predicted : Field)
Airborne Avionics     >20:1
Airborne Radar        5.0:1
Airborne Fighter      9.1:1
Airborne Transport    2.3:1

Table  1.  Ratio disparity between predicted and field MTBF [After Ref. 1]


Reliability  prediction  is  performed  during  the  early  design  phase,  when  the 
prototype  is  not  yet  built,  to  estimate  the  expected  field  reliability  of  the  system.  The 
most widely used prediction method in the military is MIL-HDBK-217. Although
DoD  has  discontinued  updates  of  MIL-HDBK-217F,  this  standard  is  still  widely  used  in 
the  military.  Its  effectiveness  has  not  been  clearly  established  since  it  has  been  shown  that 
there  exist  large  variations  between  predicted  and  field  reliability.  Research  efforts  are 
required  to  examine  the  problems  of  the  MIL-HDBK-217  prediction  model  that  have 
caused  this  disparity. 

The inability to relate predicted reliability to field reliability could have a severe
impact from both the logistics and operational perspectives. A recent analysis performed
on  the  Comanche  helicopter  by  an  NPS  student  indicates  that  missing  the  predicted 
availability  by  just  one  percent  could  increase  the  life-cycle  Operation  &  Support  (O&S) 
cost  by  more  than  $75  million  [Ref.  3]. 

As  important  as  reliability  prediction  is,  its  value  starts  to  diminish  once 
prototypes  are  built  and  the  reliability  can  be  assessed  via  testing.  Reliability  prediction 
and  reliability  testing  play  different  roles  but  they  complement  one  another  at  different 
stages of the product development cycle. Reliability testing is performed to ensure that
the  fielded  system  meets  the  specified  level  of  reliability. 

Over the years, there have been numerous reported cases of military systems exhibiting
poor  reliability.  One  example  is  the  Hunter  Unmanned  Aerial  Vehicle  (UAV)  System 
[Ref. 4]. The urgent need for the US Army to have a UAV System forced the Hunter
System to be fielded without going through its development phase, which means that the
system was not adequately tested. Consequently, several Air Vehicles (AVs) were lost
due to various failures, which finally resulted in a decision to terminate the production
program. The lesson learned is to recognize the significance of reliability testing during
the  development  phase.  Reliability  can  only  be  validated  with  rigorous  testing  under 
actual  combat  conditions.  This  is  especially  important  for  complex  and  state-of-the-art 
weapon  systems.  There  are  too  many  uncertainties  and  risks  involved,  especially  in  the 
area  of  systems  design,  and  it  is  virtually  impossible  for  designers  to  predict  in  advance 
all  possible  sources  of  failure  modes.  Failure  to  achieve  an  acceptable  level  of  reliability 
at  this  late  stage  of  development  can  have  a  devastating  impact  on  the  program,  including 
fielding  a  less  reliable  weapon  system  and  incurring  additional  cost  for  retesting  and 
redesign. 

Reliability  testing  does  not  guarantee  that  reliability  targets  will  be  met  ultimately 
but  having  a  strong  emphasis  on  reliability  testing  should  substantially  increase  the 
chances  of  meeting  these  objectives.  During  system  development,  the  eventual  goal  for 
the  system’s  reliability  is  known  as  the  reliability  target.  However,  the  initial  prototypes 
produced  will  almost  certainly  contain  design,  quality,  and  other  engineering  related 
flaws  that  prevent  a  prototype  from  reaching  the  target  immediately.  In  order  to  improve 
the  reliability,  the  prototypes  are  subjected  to  intensive  testing  to  identify  and  implement 
corrective  actions  to  improve  the  design.  This  process  of  testing,  fixing,  and  testing  to 
increase  the  system’s  reliability  is  known  as  reliability  growth.  Reliability  growth  is 
generally  quantified  by  an  increase  in  mean  time  between  failures  over  time.  The 
intervals  between  failures  will  become  longer  on  average  if  there  is  positive  reliability 
growth.  On  the  other  hand,  if  negative  growth  is  occurring,  these  intervals  will  tend  to  be 
shorter.  For  no  growth,  the  intervals  will  retain  the  same  mean. 
The estimation of system reliability involves the use of a reliability growth model.
A  reliability  growth  model  is  an  analytical  model  that  represents  the  reliability  of  the 
system  during  the  development  process.  It  accounts  for  the  changes  in  reliability  due  to 
all  corrective  actions  incorporated  during  the  developmental  phase.  The  basic  principle  of 
a  reliability  model  is  to  apply  the  failure  data  collected  during  prototype  testing  to 
determine  the  reliability  of  the  system.  A  reliability  model  is  also  used  for  developing  a 
test  plan  to  determine  the  amount  of  test  time  required  to  meet  the  reliability  targets. 
Once  the  test  plan  is  developed,  the  model  can  be  used  as  data  is  collected  to  estimate  the 
reliability  of  the  system  during  the  test  phase  in  order  to  know  how  much  additional 
testing is required to meet the target. Extrapolating a growth curve beyond the current
data estimates what reliability a program can be expected to achieve, provided that the
conditions  of  the  test  and  the  engineering  effort  to  improve  reliability  are  maintained  at 
their  present  levels. 

Although many models exist for modeling reliability growth, the Duane and the
US Army Materiel Systems Analysis Activity (AMSAA) models are among the most
widely used in the military [Ref. 5]. Because it is deterministic, the Duane model is
commonly used for constructing the idealized growth curve in reliability growth
planning. The AMSAA model employs the Weibull process to model reliability growth,
and its statistical nature allows estimation of unknown parameters using test data, which
makes it a useful tool for reliability assessment.
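
As an illustration of what estimating the unknown parameters from test data can look like, the sketch below applies the standard maximum likelihood estimators for a power-law (Weibull process) NHPP observed over a time-terminated test. The estimator formulas are the textbook ones for the basic model and are an assumption here, since this chapter does not state them; the function name and the failure times are hypothetical.

```python
import math


def crow_amsaa_mle(failure_times, total_time):
    """MLEs for a power-law NHPP from a time-terminated test (assumed basic model).

    Returns (beta_hat, lambda_hat, instantaneous MTBF at total_time).
    """
    n = len(failure_times)
    if n == 0:
        raise ValueError("at least one failure time is required")
    if any(t <= 0 or t > total_time for t in failure_times):
        raise ValueError("failure times must lie in (0, total_time]")
    log_sum = sum(math.log(total_time / t) for t in failure_times)
    if log_sum == 0:
        raise ValueError("failure times must not all equal total_time")
    beta_hat = n / log_sum
    lambda_hat = n / total_time ** beta_hat
    intensity_now = lambda_hat * beta_hat * total_time ** (beta_hat - 1.0)
    return beta_hat, lambda_hat, 1.0 / intensity_now


if __name__ == "__main__":
    # Hypothetical cumulative failure times (hours) from a 500-hour test.
    times = [12.0, 40.0, 95.0, 180.0, 310.0, 470.0]
    beta, lam, mtbf = crow_amsaa_mle(times, total_time=500.0)
    print(f"beta = {beta:.3f}, lambda = {lam:.4f}, current MTBF = {mtbf:.1f} h")
```

An estimated beta below 1 indicates that failures are arriving more slowly over time, that is, positive reliability growth.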


B.  OBJECTIVES  AND  SCOPE  OF  RESEARCH 


The  first  part  of  this  thesis  aims  to  identify  and  analyze  what  principal  aspects  of 
the  MIL-HDBK-217  prediction  model  are  causing  the  large  variation  between  prediction 
and  field  reliability. 

The second part of the thesis aims to demonstrate the use of the
Test-Analyze-And-Fix (TAAF) concept for the reliability growth analysis of a combat
system. The main intent is to demonstrate how the Duane reliability growth model can be
used to construct an idealized growth curve for the purpose of reliability growth
planning, and to apply the AMSAA Extended Reliability Growth Models for
analyzing reliability growth.

Lastly,  lessons  learned  and  recommendations  on  reliability  growth  based  on  this 
research  will  be  presented  in  this  thesis. 


C.  POTENTIAL  BENEFITS  OF  RESEARCH 


This research consolidates some important findings on the causes of the
inaccuracy in MIL-HDBK-217 reliability prediction, with the ultimate goal of
minimizing the gap between predicted and field reliability.

This thesis also shows how program managers can plan tests and analyze test data
using commercial software to estimate the system’s demonstrated reliability and the
increase in reliability due to delayed fixes.




II.  ANALYSIS  OF  THE  RELIABILITY  DISPARITY 


A.  INTRODUCTION 


Within  the  military,  accurate  prediction  of  system  reliability  plays  a  critical  role 
from  both  the  logistics  and  operational  perspective.  MTBF  figures  are  used  for  many 
logistics  and  operational  planning  activities  [Ref.  6].  They  include  the  following: 

Spares Provisioning. Determination of spare quantities purchased to meet
operational availability. Components with higher failure rates need to be stocked in
larger quantities.

Development of Maintenance Strategies. In many cases, MTBF is used to
determine the preventive maintenance intervals of a component.

Estimation of Life Cycle Cost. Estimation of the total system cost on a yearly
basis.

Unfortunately,  there  are  a  host  of  factors  that  give  rise  to  the  disparity  between 
predicted  and  field  MTBF.  The  focus  here  is  to  identify  and  analyze  principal  aspects  of 
the  MIL-HDBK-217  prediction  model  that  are  causing  the  large  variation  between 
prediction  and  field  reliability. 

The remainder of this chapter first discusses the key concepts pertinent to the
understanding of the research theme, which include the “bathtub” curve, the exponential
distribution, and the principles of reliability prediction, followed by a discussion of the
results, conclusions and recommendations.

B.  KEY  CONCEPTS 


This section provides a fundamental understanding of the key concepts related to
reliability prediction, such as the “bathtub” curve, the exponential distribution and the
principles of reliability prediction, in order to understand the research theme: the
reliability disparity.
1.  The Bathtub Curve


Figure  1  shows  a  “bathtub”  curve  that  is  often  used  in  the  field  of  reliability  to 
describe  the  failure  rate  behavior  of  a  system  over  its  life  cycle.  The  vertical  axis  of  the 
“bathtub”  curve  represents  the  hazard  rate  or  the  instantaneous  failure  rate.  The  hazard 
rate applies only to non-repairable systems in which only one failure can occur. For
repairable systems the term failure rate or rate of occurrence of failures is more
appropriate.  The  “bathtub”  curve  consists  of  three  distinct  regions:  infant  mortality, 
useful  life  and  wear-out  [Ref.  7]. 

The  infant  mortality  region  exhibits  a  decreasing  failure  rate,  characterized  by 
early  failures  attributable  to  defects  in  design,  manufacturing  or  construction.  The  failure 
rate  decreases  with  time  as  the  design  defects  are  detected  and  repaired.  The  failure  rate  is 
the  probability  of  failure  in  the  next  interval  of  time  given  that  an  item  has  survived  to  a 
certain  age,  divided  by  the  length  of  the  interval.  It  is  an  important  function  in  reliability 
analysis  since  it  shows  changes  in  probability  of  failure  over  the  lifetime  of  a  system.  One 
way  to  eliminate  such  failures  is  through  design  and  production  quality  control  measures 
that  will  reduce  variability  and  hence  infant  mortality  failures  [Ref.  12]. 
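
The verbal definition of the failure rate given above can be written compactly. The following is the standard textbook form, with T denoting the time to failure, f(t) its probability density, and R(t) = P(T > t) the reliability function; these symbols are introduced here for illustration and are not taken from the original text.

```latex
h(t) \;=\; \lim_{\Delta t \to 0}
      \frac{P\left(t < T \le t + \Delta t \,\middle|\, T > t\right)}{\Delta t}
     \;=\; \frac{f(t)}{R(t)},
\qquad
R(t) \;=\; \exp\!\left(-\int_0^t h(u)\,du\right)
```

A decreasing h(t) corresponds to the infant mortality region, a constant h(t) to the useful life region, and an increasing h(t) to the wear-out region.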

The  useful  life  region  by  assumption  has  a  reasonably  constant  failure  rate, 
characterized  by  random  failures.  These  failures  are  likely  caused  by  unavoidable  load 
rather  than  any  inherent  defect  in  the  system.  There  are  many  forms  of  possible  external 
loadings  such  as  temperature  fluctuations,  vibration,  power  surges  and  moisture  variation. 
Random  failures  can  be  reduced  by  increasing  the  robustness  of  the  design  and/or 
controlling  the  external  environment. 

The  wear-out  region  has  an  increasing  failure  rate  characterized  by  the  aging 
phenomena.  The  typical  failure  mechanisms  are  corrosion,  fatigue  cracking, 
embrittlement,  and  diffusion  of  materials. 

In  reliability  prediction,  the  failure  rate  of  a  system  has  often  been  assumed  to  be 
constant  which  resembles  the  useful  life  region  of  the  bathtub  curve  as  shown  in  Figure  1. 
In  reality,  the  assumption  of  constant  failure  rate  is  more  representative  of  an  electronic 
system rather than a mechanical system. Failure occurrences in electronic systems are
considered random and are assumed to follow a Poisson process.

On the other hand, the failure distribution of mechanical hardware is characterized
by an initial wear-in period followed by a long span of increasing failure rate. The
primary failure mechanisms for mechanical systems are corrosion, fatigue and other
cumulative effects.


Figure  1.  The  “bathtub”  curve  [From  Ref.  7] 

2.  The  Exponential  Distribution 

The  exponential  distribution  models  the  failure  rate  in  the  useful  life  region  of  the 
bathtub  curve  as  it  assumes  that  the  rate  at  which  the  system  fails  is  independent  of  its 
cumulative  age  [Ref.  8].  This  assumption  greatly  simplifies  the  mathematics  involved  in 
reliability  analysis  as  it  is  much  easier  to  calculate  with  an  assumed  constant  failure  rate 
than to derive the parameters of a two-parameter distribution (e.g., Weibull). This is one
of  the  main  reasons  for  its  wide  application  in  many  reliability  analyses. 
Further, the memoryless (lack of memory) property of the exponential distribution
implies that a repaired system is as good as new. For the exponential distribution,
reliability as a function of time, $t$, and failure rate, $\lambda$, is written as

$$R(t) = e^{-\lambda t}. \qquad (2.1)$$
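
A minimal numerical sketch of Equation 2.1 and of the memoryless property follows. The function names and the numerical values are hypothetical; the point is only that, under a constant failure rate, the probability of surviving a further interval does not depend on the age already accumulated.

```python
import math


def reliability(t, failure_rate):
    """Exponential reliability R(t) = exp(-failure_rate * t), as in Equation 2.1."""
    return math.exp(-failure_rate * t)


def conditional_reliability(t, age, failure_rate):
    """Probability of surviving a further time t, given survival to the stated age."""
    return reliability(age + t, failure_rate) / reliability(age, failure_rate)


if __name__ == "__main__":
    lam = 0.002  # hypothetical failure rate, failures per hour
    print(f"R(100 h) for a new item:        {reliability(100.0, lam):.4f}")
    print(f"R(100 h) after 500 h of service: "
          f"{conditional_reliability(100.0, 500.0, lam):.4f}")
```

Both printed values coincide, which is the sense in which the exponential model treats an aged or repaired system as being as good as new.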


3.  Reliability  Prediction  Model 

MIL-HDBK-217, the Military Handbook for “Reliability Prediction of Electronic
Equipment”, is the standard reference used in the military for reliability prediction of
electronic equipment parts. It was first published by the Department of Defense (DoD) in
the 1960s. Since then it has been updated several times, with the most recent, Revision F
Notice 2, released in February 1995 [Ref. 9]. Table 2 shows some of the prediction
models available in the military and commercial industry.


Model               Description
MIL-HDBK-217        Original worldwide standard (MIL-STD-217) for commercial &
                    military electronic components
Telcordia SR-332    Original Bellcore standard for commercial grade electronic
                    components
PRISM               Originally developed by the Reliability Analysis Center (RAC),
                    incorporates process grading factors
CNET 93             Developed by France Telecom

Table  2.  Reliability models/standards [After Ref. 12]


Conventional reliability prediction assumes that all failures are independent. It
first defines the failure rate of each key component that makes up the system and then
sums these rates to obtain the overall system failure rate, assuming a series system. The
MIL-HDBK-217 reliability prediction model assumes a constant failure rate for all the
components. The validity and usefulness of this assumption have often been challenged
by practitioners in the field of reliability. Many have denounced the use of this
assumption as impractical because it assumes that a system does not wear out over time.
MIL-HDBK-217 consists of two approaches: Parts-Count and Parts-Stress.

a.  Parts  Count 

The Parts-Count approach is simpler as it requires less information than the
Parts-Stress approach. It only requires knowledge of the quantities of components, the
application environment and the quality factor, $\pi_Q$. A quality factor above 1.0
implies a poor quality component. The prediction for each part is governed by the
application of a quality factor to a base failure rate. The quality factor for most standard
electronic components can be found in MIL-HDBK-217. This approach is most useful in
the early design stage when the system hardware is not yet available.

MIL-HDBK-217F parts count defines the overall equipment failure rate as:

$$\lambda_{equip} = \sum_{i=1}^{n} N_i \left(\lambda_g \pi_Q\right)_i \qquad (2.2)$$

$\lambda_g$ = Failure rate of the ith generic part

$n$ = Number of generic part categories

$N_i$ = Quantity of the ith generic part

$\pi_Q$ = Quality factor of the ith generic part
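
The Parts-Count summation in Equation 2.2 can be sketched in a few lines. The part list below is entirely hypothetical (it is not taken from MIL-HDBK-217 tables), and the unit of failures per million hours is an assumption made for the example.

```python
def parts_count_failure_rate(parts):
    """Equipment failure rate as the sum of N_i * lambda_g * pi_Q over part categories.

    parts -- iterable of (quantity, generic_failure_rate, quality_factor) tuples
    """
    return sum(quantity * generic_rate * quality_factor
               for quantity, generic_rate, quality_factor in parts)


if __name__ == "__main__":
    # Hypothetical bill of materials: (N_i, lambda_g per 1e6 hours, pi_Q).
    bill_of_materials = [
        (10, 0.002, 1.0),  # e.g. resistors
        (4, 0.05, 2.0),    # e.g. lower quality capacitors
        (1, 0.5, 1.0),     # e.g. a microcircuit
    ]
    rate = parts_count_failure_rate(bill_of_materials)
    print(f"equipment failure rate: {rate:.2f} per 1e6 hours")
    print(f"predicted MTBF: {1e6 / rate:,.0f} hours")
```

The predicted MTBF is simply the reciprocal of the summed rate, which is valid only under the constant failure rate and series system assumptions stated earlier.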

b.  Parts  Stress 

The  Parts-Stress  approach  is  more  complex  as  it  takes  into  account  the 
various stress factors such as temperature, environment, quality, electrical, etc., on the
component.  The  electrical  stress  is  usually  defined  as  a  ratio  of  the  operating  value  to  the 
rated  value.  For  instance,  the  defining  stress  factor  for  a  resistor  is  current.  Therefore,  the 
operating  current  and  rated  current  are  used  in  the  part  stress  calculation  model.  This 
approach  is  more  applicable  later  in  the  design  phase  when  the  hardware  and  knowledge 
of  the  operating  environment  are  available  in  order  to  estimate  the  various  stress  factors. 

The models for the MIL-HDBK-217 Parts-Stress approach are much more
detailed and vary across part types. The model for the low frequency diode is shown
below [Ref. 17].
$$\lambda_p = \lambda_b \, \pi_T \, \pi_S \, \pi_Q \, \pi_E \qquad (2.3)$$

$\lambda_p$ = Part failure rate

$\lambda_b$ = Base failure rate

$\pi_T$ = Temperature factor

$\pi_E$ = Environment factor

$\pi_Q$ = Quality factor

$\pi_S$ = Electrical stress factor


The  accuracy  of  both  approaches  is  highly  dependent  upon  the  availability 
and  accuracy  of  data  such  as  the  base  failure  rate  and  the  various  required  factors. 
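
The multiplicative structure of the Parts-Stress model in Equation 2.3 can be sketched as follows. The base failure rate and the factor values are hypothetical and are not taken from MIL-HDBK-217 tables; in practice each factor would be looked up or computed from the handbook for the specific part and environment.

```python
import math


def parts_stress_failure_rate(base_rate, pi_factors):
    """Part failure rate as the base rate multiplied by every pi factor supplied.

    pi_factors -- mapping of factor name to value, e.g. temperature, stress,
                  quality and environment factors
    """
    if base_rate < 0 or any(value < 0 for value in pi_factors.values()):
        raise ValueError("rates and factors must be non-negative")
    return base_rate * math.prod(pi_factors.values())


if __name__ == "__main__":
    # Hypothetical diode: base rate in failures per 1e6 hours.
    factors = {"pi_T": 2.0, "pi_S": 0.5, "pi_Q": 2.5, "pi_E": 6.0}
    rate = parts_stress_failure_rate(0.004, factors)
    print(f"part failure rate: {rate:.3f} per 1e6 hours")
```

Because the factors multiply, an error in any single factor propagates proportionally into the predicted part failure rate.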


C.  MAJOR  FACTORS  AFFECTING  RELIABILITY  PREDICTION


There  are  a  number  of  studies  that  either  directly  or  indirectly  address  the  problem 
of  the  reliability  disparity.  This  section  focuses  on  the  limitations  of  the  MIL-HDBK-217 
model.  The  key  findings  of  the  literature  research  suggest  that  the  disparity  stems  from 
some  inherent  assumptions  of  the  MIL-HDBK-217  model.  For  example,  the  constant 
failure  rate  assumption  that  has  been  generally  applied  in  reliability  prediction  is  usually 
not  applicable.  The  lack  of  accurate  field  failure  rates  of  components  or  parts  can  also 
affect  prediction  accuracy.  Further,  the  prediction  model  cannot  predict  unexpected 
failure  modes  that  occur  in  the  field  due  to  poor  design  and  poor  quality  control. 

1.  Inapplicability of the Constant Failure Rate Assumption in
MIL-HDBK-217 Reliability Prediction Model

System failures can be assumed to follow a Poisson process if the times to failure
of all the components that make up a system can be regarded as exponential and
component failures can be regarded as independent. The rate of failure occurrence of the
system can then be obtained by summing up the failure rates of the individual
components. This has been regarded as reasonable for electronic components, and thus
provides the basis for the MIL-HDBK-217 prediction model. The exponential distribution
also assumes all repairs, no matter how minor, restore the system to an “as new”
condition. This assumption is often in stark contrast to reality for the following reasons
[Ref. 10]:

1.  Failure  and  repair  of  one  part  may  cause  damage  to  other  parts.  Therefore,  the 
times  between  successive  failures  are  not  necessarily  independent. 

2.  Repairs may not totally renew the system. Repairs can be imperfect or may
introduce other defects leading to failures of other parts. The memoryless property of
the exponential distribution might not be valid in every case.

Since  component  failures  are  not  always  independent,  the  principle  of  summing 
up  the  failure  rates  of  the  individual  components  to  obtain  the  overall  system’s  failure  rate 
might  not  result  in  the  best  estimate. 
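
The role of the independence assumption in the summation rule can be made explicit with a short derivation. It is the standard result for a series system of $n$ components with independent, exponentially distributed times to failure; the symbols $R_i$ and $\lambda_i$ (reliability and failure rate of the ith component) are introduced here for illustration.

```latex
R_{sys}(t) \;=\; \prod_{i=1}^{n} R_i(t)
           \;=\; \prod_{i=1}^{n} e^{-\lambda_i t}
           \;=\; \exp\!\left(-t \sum_{i=1}^{n} \lambda_i\right)
\quad\Longrightarrow\quad
\lambda_{sys} \;=\; \sum_{i=1}^{n} \lambda_i,
\qquad
\mathrm{MTBF}_{sys} \;=\; \frac{1}{\sum_{i=1}^{n} \lambda_i}
```

The first equality is where independence enters: if the failure or repair of one part affects another, the system reliability is no longer the product of the component reliabilities, and the simple sum of failure rates loses its justification.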

Below  are  two  examples  to  further  describe  the  limitations  of  using  the  constant 
failure  rate  assumption  of  the  exponential  distribution  for  reliability  prediction. 

Figure 2 shows the results of using the exponential and Weibull distributions to
model the human mortality rate [Ref. 11]. A similar failure rate pattern is
representative of a system with a short period of useful life followed by a long period of
wear-out. It can clearly be seen from Figure 2 that the exponential distribution
grossly under-estimates the later failure rate while over-estimating the initial failure rate.
In contrast, the Weibull distribution is more suitable in such a situation. The purpose of
this example is to show that the constant failure rate assumption does not apply to a
system with a dominant wear-out failure mechanism.

Figure 2. Human hazard rate analysis (human mortality rate) [From Ref. 11]
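The contrast between the two distributions can be reproduced numerically. The sketch below is illustrative only: the Weibull shape and scale values are hypothetical and are not fitted to the mortality data of Ref. 11. It compares an increasing Weibull hazard with a constant hazard matched to the same mean life.

```python
import math

def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)

def weibull_mean(shape, scale):
    """Mean life of a Weibull distribution."""
    return scale * math.gamma(1.0 + 1.0 / shape)

# Hypothetical wear-out dominated item; shape > 1 gives an increasing hazard.
SHAPE, SCALE = 3.0, 10.0   # illustrative values, time in years

# Constant failure rate model matched to the same mean life.
constant_rate = 1.0 / weibull_mean(SHAPE, SCALE)

for t in (1.0, 5.0, 10.0, 15.0):
    print(f"t={t:5.1f}  Weibull hazard={weibull_hazard(t, SHAPE, SCALE):.4f}  "
          f"constant hazard={constant_rate:.4f}")
```

With these values the constant rate lies above the Weibull hazard early in life and well below it late in life, which is the pattern described for Figure 2. Setting the shape parameter to 1 recovers the constant failure rate model as a special case.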


[Figure: bathtub curve with the early failures and constant failure rate regions
marked, and the life-cycle spans of Case I and Case II indicated]

Figure 3. Constituent curves of the “bathtub” curve [From Ref. 14]

Further, the applicability of the constant failure rate assumption also hinges
strongly upon the relationship between the system’s nature and its life cycle [Ref. 14].
Figure 3 shows a typical bathtub curve with its three distinct regions. The failure rate of
electronic equipment with a maximum life cycle of five years can be approximated
by Case I. Case II approximates mechanical equipment with a life cycle of ten years.
Comparing the two cases, the constant failure rate assumption provides the better
approximation over the five-year period. The reliability is given by the following
equation.

$$ R(t) = e^{-H(t)} \tag{2.4} $$

where $H(t)$ is the cumulative hazard rate.

For Case I, the failure rate is first under-estimated during the early failure region and
then over-estimated during the constant failure rate region. Overall, the assumption still
provides a fairly good approximation.

On the other hand, the error between predicted and actual values is simply too great over
a ten-year period because of the relatively long period of increasing failure rate in the
wear-out region. This leads to the important conclusion that the validity of the constant
failure rate assumption is highly dependent upon the system’s life cycle.
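The dependence on life cycle can be illustrated by evaluating equation 2.4 for a bathtub-shaped hazard over a five-year and a ten-year period. The hazard function below is entirely hypothetical (its coefficients are chosen for illustration and do not come from Ref. 14); the cumulative hazard is obtained by simple trapezoidal integration.

```python
import math

def bathtub_hazard(t):
    """Hypothetical bathtub hazard in failures per year (illustrative numbers)."""
    early = 0.30 * math.exp(-3.0 * t)            # decaying early-failure term
    useful = 0.05                                # constant useful-life term
    wearout = 0.04 * max(0.0, t - 5.0) ** 2      # wear-out term starting at year 5
    return early + useful + wearout

def cumulative_hazard(hazard, t, steps=10_000):
    """H(t): integral of the hazard from 0 to t, by the trapezoidal rule."""
    dt = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    total += sum(hazard(i * dt) for i in range(1, steps))
    return total * dt

def reliability(hazard, t):
    """Equation 2.4: R(t) = exp(-H(t))."""
    return math.exp(-cumulative_hazard(hazard, t))

CONSTANT_RATE = 0.05   # constant failure rate prediction (useful-life value)

errors = {}
for life in (5.0, 10.0):
    actual = reliability(bathtub_hazard, life)
    predicted = math.exp(-CONSTANT_RATE * life)
    errors[life] = abs(predicted - actual)
    print(f"{life:4.0f} years: actual R={actual:.3f}, constant-rate R={predicted:.3f}")
```

Under these assumed numbers the constant-rate prediction is reasonably close over five years and far too optimistic over ten years, because the wear-out term dominates the cumulative hazard in the second half of the longer life cycle.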

In addition, a common conclusion that can be drawn from the two previous
examples is that the constant failure rate assumption tends to under-estimate the
system’s overall failure rate, by an amount that depends on the relative length of the
wear-out region within its life cycle. As observed from Figure 3, the wider the wear-out
region over the life cycle, the greater the error margin. This further reinforces the
point that the assumption is not suitable for predicting failure rates of a system with a
dominant wear-out failure mechanism. Reliability prediction using this assumption for a
system characterized by a long period of wear-out provides little insight from the
logistics planning perspective, as it can result in a severe under-purchase of spares. All
these reasons explain why reliability prediction using the constant failure rate assumption
often yields results that are inconsistent with field reliability.

2. Lack of Accurate Failure Rate Data


A reliability prediction model is effectively a set of “best guesses”, and to achieve
any degree of accuracy it must use empirically acquired field data. Prediction accuracy
depends to a large extent on the amount of field data available, and the painful fact is
that data collection takes a long time [Ref. 13].

The  task  of  acquiring  field  data  of  components  is  not  a  simple  task  because  it 
takes  time  for  a  component  to  fail  before  meaningful  data  on  failure  rates  can  be 
gathered.  A  well  designed  part  is  less  likely  to  fail  early,  leading  to  extended  waiting 
time  for  any  useful  information.  Because  the  task  is  so  time  consuming,  there  are 
relatively  few  sources,  usually  from  the  manufacturers  themselves.  The  largest  sources  of 
field  data  are  the  Non-electronic  Parts  Reliability  Data  (NPRD-95)  and  Electronic  Parts 
Reliability  Data  (EPRD-97)  produced  by  the  military  [Ref.  18].  These  were  compiled 
through  years  of  observation,  repair  records,  and  other  activities.  Since  failure  rate 
depends  mainly  on  design  and  application,  these  data  are  not  representative  of  all  cases. 
Further,  the  rapid  development  of  electronic  technology  limits  the  ability  to  collect  ample 
data  for  any  particular  technology. 

A possible solution to shorten the waiting time for acquiring failure rate data is to
apply accelerated life testing (ALT) to components. Accelerated life tests are
component life tests in which the components are operated at high stress and the
resulting failure data are observed [Ref. 22].


3. Inability to Predict Unexpected Failure Modes Due to Poor Design
and Quality Related Problems


Westinghouse Defense and Electronic Center performed a case study on a
complex Electronic Countermeasures (ECM) military radar system that underwent a
Reliability Demonstration Test (RDT). The study examined the differences between
predicted and field reliability and analyzed these problems in light of the MIL-HDBK-217
prediction model [Ref. 15].


                 Predicted MTBF    RDT MTBF
Radar System     282 hours         100 hours

Table 3. Predicted and RDT MTBF [After Ref. 15]


It was found that the main differences are related to the assumptions made about the
quality of design and the adherence to the established and specified quality control
procedures in producing the parts. The MIL-HDBK-217 model inherently assumes that
certain standards are followed in these areas based on specified engineering requirements,
but this assumption is not always valid. Two examples of failures that were
identified during the RDT are presented to support this claim.

The first failure is due to a design deficiency in a thin film RF amplifier. The
failure arose because of inadequate clearance between the toroid of the RF amplifier and
the lid of the case, which was not foreseen during the initial design of the device.
The toroid was mounted too close to the device lid. After several thermal cycles the
toroid moved relative to its original position, made contact with the lid of the case, and
caused a short circuit. The assumption during the initial reliability prediction of this
device was that all design considerations for the device were completely satisfied.
Obviously, this assumption was not valid in this case.

The second failure concerns thick film devices that consist of many discrete parts
and solder joints. Solder balls formed because the solder flow process was out of
control, in that the solder flow temperature deviated from the specified range during the
device manufacturing process. The resulting solder balls, which were loosely attached at
various points of the device, broke loose and lodged between various chips and the
substrate, causing component-to-substrate short circuits. Reliability prediction is unable
to predict these unexpected failure modes that arise as a result of poor quality control, as
it inherently assumes that all processes are in proper control.

The  two  failures  previously  described  were  a  direct  result  of  poor  design  and  lack 
of  quality  control,  respectively,  which  were  identified  during  RDT.  During  the  initial 
reliability  prediction  prior  to  production  and  test,  it  is  almost  impossible  to  know  in 
advance  how  good  these  control  methods  and  engineering  designs  would  be.  Therefore  it 
is  extremely  important  to  be  aware  of  the  differences  between  the  inherent  assumptions 
of  the  MIL-HDBK-217  prediction  model  and  the  many  uncertainties  that  can  happen 
during  the  actual  engineering  process. 


D.  RECOMMENDATIONS 


The  constant  failure  rate  model  is  mathematically  simple  for  reliability  prediction 
but  it  is  not  always  applicable.  It  serves  as  a  good  approximation  for  a  system  that  is 
characterized  by  a  long  period  of  useful  life  and  a  short  period  of  early  failure.  In  order  to 
improve  the  precision  of  reliability  prediction,  the  reliability  engineer  must  be  able  to 
recognize  when  the  mathematical  simplicity  of  the  constant  failure  rate  model  can  be 
used  without  a  substantial  penalty  in  prediction  accuracy.  This  can  be  achieved  by 
analyzing  the  failure  rate  distribution  of  a  system  over  its  intended  life  and  deciding  if  the 
exponential distribution is applicable. The failure rate distribution of a system can be
estimated by analyzing the failure trends of a similar class of systems.

It is also important to be aware of Drenick’s Theorem, which proves that
complex repairable systems, under certain constraints, tend towards being well
represented by the exponential distribution [Ref. 18]. Given that most military systems
(aircraft, artillery guns, or naval ships) are usually composed of a large number of
components, it would seem that the constant failure rate assumption is applicable. The
usefulness of Drenick’s Theorem depends on the following constraints: 1) the
subcomponents are in series; 2) the subcomponents fail independently; 3) a failed
subcomponent is replaced immediately; 4) the replacement subcomponent is identical;
and 5) a few system repairs have already been made. Once these conditions are met,
and as the number of subcomponents increases, system failures will tend to be
exponentially distributed regardless of the failure distributions of the subcomponents.
This result allows reliability practitioners to disregard the failure distributions of the
individual components that make up the system, since the times between failures of the
overall system will be approximately exponentially distributed.
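The tendency described by Drenick’s Theorem can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not a proof: 50 identical components in series, each with a hypothetical Weibull life distribution (shape 2, scale 1000 hours), replaced immediately on failure. The coefficient of variation (CV) of the times between system failures is computed after a burn-in period that stands in for constraint 5; an exponential distribution has a CV of 1, whereas the assumed component distribution has a CV of about 0.52.

```python
import random
import statistics

def superposed_failure_times(n_components, shape, scale, horizon, rng):
    """Merge the failure times of n independent, immediately replaced components."""
    events = []
    for _ in range(n_components):
        t = rng.weibullvariate(scale, shape)        # life of the first part
        while t < horizon:
            events.append(t)
            t += rng.weibullvariate(scale, shape)   # identical replacement part
    return sorted(events)

def interarrival_cv(events, burn_in):
    """Coefficient of variation of times between system failures after burn_in."""
    kept = [t for t in events if t >= burn_in]
    gaps = [later - earlier for earlier, later in zip(kept, kept[1:])]
    return statistics.stdev(gaps) / statistics.mean(gaps)

rng = random.Random(12345)   # fixed seed so the run is repeatable
events = superposed_failure_times(50, 2.0, 1000.0, 30_000.0, rng)
cv = interarrival_cv(events, 10_000.0)
print(f"system failures: {len(events)}, CV of times between failures: {cv:.2f}")
```

A CV close to 1 for the system, despite the strongly non-exponential component lives, is consistent with the theorem. Removing the burn-in or using only a handful of components would be expected to weaken the effect.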

However, one must be aware that the constant failure rate assumption tends to
under-estimate the overall system failure rate when wear-out is present, and this
is important from the logistics perspective, especially in the purchase of spares.
Alternatively, the Weibull distribution may provide a more accurate prediction, but at
the cost of increased mathematical complexity. There is always a tradeoff between
accuracy and mathematical complexity.

Even if the exponential distribution can be used to model a system, the disparity
between predicted and field reliability may still exist in new systems because of
unexpected failure modes that arise in the presence of design and quality
deficiencies, which prevent the system from reaching the predicted value. One
possible solution to eliminate or reduce the frequency of occurrence of unexpected
failures in the field is to apply reliability growth testing (RGT) during the development
phase. In contrast to the exponential distribution, the AMSAA reliability growth models
used for reliability growth analysis assume that system failures follow a
Non-Homogeneous Poisson Process (NHPP). Reliability growth testing recognizes that the
drawing board design of a complex product cannot be perfect from the reliability point of
view, and allocates the necessary time and resources to fine tune the design by finding
those problems that are impossible to know in advance during reliability prediction and
designing them out. It follows the formal process of Test-Analyze-And-Fix (TAAF),
which involves testing the system to surface all possible failure modes, analyzing the
underlying failure mechanisms to determine their root causes, implementing corrective
actions to improve the design, and finally re-testing to verify the effectiveness of the
corrective actions in preventing future occurrences. Once the system matures through a
period of testing and reliability growth has reached a plateau, the system’s failures
will tend towards being well represented by an exponential distribution. Consequently,
the disparity between predicted and field MTBF can be minimized. It is also important to
realize that, in order to maximize the benefits of a reliability growth program, it has to be
conducted as early as possible in the development phase once the prototype is available.
The earlier these problems are identified, the more time will be available to verify the
effectiveness of the design changes. Furthermore, the cost associated with redesigning a
product late in the development cycle is extremely high.

The remaining chapters of this thesis discuss the reliability growth testing of
a 155mm SPH artillery gun using reliability growth methodology. The next chapter first
introduces the reliability growth methodology and then illustrates the use of
Duane’s model to determine the essential parameters needed for constructing the
idealized growth curve as part of reliability growth planning.


III. RELIABILITY GROWTH PLANNING


A.  INTRODUCTION 


Reliability  growth  is  the  improvement  of  a  product’s  reliability  over  time  (hence 
the  term  growth)  using  the  TAAF  philosophy  through  learning  about  the  deficiencies  of 
the  design  and  taking  action  to  eliminate  or  minimize  the  effect  of  these  deficiencies.  The 
growth  in  reliability  is  quantified  by  a  decrease  in  system’s  failure  rate  or  increase  in  the 
test  phase  average  MTBF  over  time  due  to  the  removal  of  failure  sources.  Figure  4 
reflects  a  decreasing  trend  in  failure  rate  which  signifies  reliability  improvement  over 
time. 


Figure 4. Failure rate versus test time [From Ref. 5]

The success of a reliability growth program is dependent on factors including the
initial planning of the reliability program and an accurate assessment of the system’s
current reliability status. It is important to track reliability throughout the test program.
This is accomplished by assessing system reliability at the end of each test phase and
comparing the current reliability with the planned reliability. Planning and tracking of
reliability growth require the use of mathematical models.

One  mathematical  model  used  for  developing  an  idealized  growth  curve  in 
reliability  growth  planning  is  Duane’s  model.  A  second  mathematical  model  used  for 
tracking  reliability  growth  is  the  Non-Homogeneous-Poisson-Process  (NHPP)  model 
known  as  the  AMSAA  model  [Ref.  16].  In  contrast  to  the  constant  failure  rate  model 
used  in  reliability  prediction,  the  AMSAA  model  describes  the  failure  rate  of  the  system 
as  a  function  of  time. 
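To make the contrast with the constant failure rate model concrete, the sketch below assumes the power-law intensity commonly associated with the AMSAA (Crow) model, u(t) = λβt^(β-1). This functional form is not stated in this section and is introduced here as an assumption; the parameter values are hypothetical.

```python
def nhpp_intensity(t, lam, beta):
    """Assumed power-law failure intensity u(t) = lam * beta * t ** (beta - 1)."""
    return lam * beta * t ** (beta - 1.0)

def expected_failures(t, lam, beta):
    """Expected cumulative number of failures by time t: lam * t ** beta."""
    return lam * t ** beta

def instantaneous_mtbf(t, lam, beta):
    """Instantaneous MTBF is the reciprocal of the failure intensity."""
    return 1.0 / nhpp_intensity(t, lam, beta)

# Hypothetical parameters: beta < 1 means a decreasing failure intensity,
# i.e. reliability growth; beta = 1 reduces to a constant failure rate.
LAM, BETA = 0.5, 0.7

for t in (100.0, 500.0, 1000.0):
    print(f"t={t:6.0f}  intensity={nhpp_intensity(t, LAM, BETA):.4f}  "
          f"instantaneous MTBF={instantaneous_mtbf(t, LAM, BETA):.1f}")
```

With β equal to 1 the intensity no longer depends on time and the model collapses to the homogeneous Poisson process that underlies the constant failure rate prediction.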

B. RELIABILITY GROWTH PROGRAM OF THE COMBAT SYSTEM


The  development  of  a  large  combat  system  generally  involves  years  of  design, 
testing,  fault  diagnosis,  and  redesign  to  assure  that  when  the  system  development  is 
completed,  the  final  system  meets  or  exceeds  the  user  requirements.  Reliability  Growth 
Testing  (RGT)  was  implemented  on  one  unit  of  the  prototype  as  part  of  the  Reliability 
and  Maintainability  (R&M)  program.  The  combat  system  consists  of  two  major 
subsystems  which  are  known  as  subsystem  A  and  B.  The  ultimate  goal  of  the  combat 
system  reliability  growth  testing  program  is  to  achieve  the  stated  reliability  requirements 
for  both  subsystems. 

The reliability growth program for the combat system focuses on the following
areas:

Reliability growth planning: To develop an achievable solution based on
available resources and schedule constraints.

Test-Analyze-And-Fix (TAAF): Failure causes are isolated, analyzed and
then fixed.

Reliability growth tracking: To determine if reliability requirements have
been demonstrated.

In the RGT, a single prototype unit is tested under actual field conditions. All
the failures that surfaced during the test were analyzed, fixed and re-tested. As
the testing progresses, these fixes are incorporated into the prototype so that reliability
improves during the course of the test.

Major development efforts were mainly focused on subsystem A, as it involves the
integration of many important subsystems. The testing for subsystem A was planned
over three phases. The first phase is a pre-development test to estimate
the initial reliability of the prototype in order to gauge the amount of development effort
required to meet the target reliability goals. The additional two phases focus on meeting
the final reliability target. Reliability testing for subsystem B was also planned over
three phases.


C.  RELIABILITY  GROWTH  PLANNING  METHODOLOGY 


The  first  step  in  the  reliability  growth  process  is  reliability  growth  planning. 
Reliability  growth  planning  involves  the  development  of  an  idealized  growth  curve.  The 
major  role  of  the  idealized  reliability  growth  curve  is  to  quantify  the  overall  development 
efforts  so  that  the  growth  pattern  can  be  evaluated  relative  to  the  basic  objectives  and 
resources.  It  also  provides  the  program  manager  with  a  useful  tool  to  monitor  the 
reliability  growth  of  the  weapon  system  during  its  development. 

The reliability of a system under development generally increases rapidly at
the beginning and slows down towards the end. The idealized growth curve shown in
Figure 5 depicts reliability growth as a smooth, non-decreasing, concave-down curve with
respect to time. A typical reliability test program consists of several test phases. A
smooth curve fitted to the proposed reliability values of the system at the end of each test
phase represents the overall pattern for reliability growth.

Figure 5. Example of an idealized growth curve


The formula for developing the idealized growth curve is based on Duane’s model
[Ref. 5].

$$
M(t) =
\begin{cases}
M_I, & 0 < t < t_I \\
M_I \left( \dfrac{t}{t_I} \right)^{\alpha} (1 - \alpha)^{-1}, & t > t_I
\end{cases}
\tag{3.1}
$$

Where

$M_F$ = Desired MTBF value at $T$
$t$ = Cumulative test time
$t_I$ = Cumulative test time at starting point
$M_I$ = Average initial MTBF of the system at the beginning
$\alpha$ = System reliability growth rate, between 0 and 1.0

The development of the idealized growth curve starts with the determination of
the length of the initial test phase ($t_I$) and the average MTBF ($M_I$) over this initial
test phase. The success in the development of the idealized growth curve depends on the
ability to accurately estimate these parameters, as they affect the total test time and
growth rate required to achieve the required reliability. A system with a lower initial
average MTBF will require a longer test time for a fixed growth rate.
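The last statement can be made explicit with a short worked step. This derivation is added for illustration and is not taken from the source; it assumes only equation 3.1 together with the requirement that the curve reaches the desired MTBF $M_F$ at the total test time $T$.

$$
M_F = M_I \left( \frac{T}{t_I} \right)^{\alpha} (1 - \alpha)^{-1}
\quad \Longrightarrow \quad
T = t_I \left[ \frac{M_F \, (1 - \alpha)}{M_I} \right]^{1/\alpha}
$$

For fixed $M_F$, $t_I$ and $\alpha$, the bracketed ratio grows as $M_I$ decreases, so the required total test time $T$ increases when the initial average MTBF is lower. Because the ratio is raised to the power $1/\alpha$, a modest error in estimating $M_I$ can translate into a large change in the planned test duration when $\alpha$ is small.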

There is no standard way of determining the values of these parameters. The
average initial MTBF of the system, $M_I$, is the average MTBF over the initial test phase
before any modification is developed, implemented or tested. The practice of arbitrarily
choosing a starting point, such as 10% of the requirement, is not recommended [Ref. 5].
One way of accurately determining these parameters is to perform an initial test so that
$M_I$ and $t_I$ are known. The initial test phase of the RGT program is conducted to
“stabilize” the test data, so it must be long enough for the first failure mode to surface.


The value $M_F$ represents the desired MTBF at the end of the reliability growth
test. The total amount of testing, $T$, is determined through a joint effort between the
contractor and the program manager, and is based on considerations of available
resources and calendar time, as well as the number of prototypes available.

The growth rate of the system ($\alpha$) determines the length of time needed to grow
from the initial MTBF to the required MTBF. It indicates how fast the system reliability
is improving and is governed by the efficiency with which failures are corrected. A large
growth rate ($\alpha > 0.5$) reflects an aggressive reliability program, while a low growth
rate ($\alpha < 0.1$) reflects a program where no quick fixes are available.

For fixed values of $T$, $M_F$, $M_I$ and $t_I$, the value of $\alpha$ may be approximated by
solving the following equation:

$$
\alpha = -\ln\left(\frac{T}{t_I}\right) - 1 + \sqrt{\left(1 + \ln\left(\frac{T}{t_I}\right)\right)^2 + 2\ln\left(\frac{M_F}{M_I}\right)}
\tag{3.2}
$$

This is a reasonably good approximation when $\alpha$ is less than 0.4 [Ref. 16]. The
more precise way to solve for the value of $\alpha$ in equation 3.1 is by using numerical
methods, e.g. with MS Excel.
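Both routes to the growth rate can be sketched in a few lines. The closed-form function below assumes the standard second-order approximation, obtained by replacing -ln(1 - α) with α + α²/2 in equation 3.1 evaluated at t = T; the numerical route uses plain bisection in place of a spreadsheet solver. The function names are hypothetical, and the inputs in the usage lines are the planning values quoted for subsystem A.

```python
import math

def alpha_approx(total_time, initial_time, target_mtbf, initial_mtbf):
    """Closed-form approximation of the growth rate (assumed second-order form)."""
    log_t = math.log(total_time / initial_time)
    log_m = math.log(target_mtbf / initial_mtbf)
    return -log_t - 1.0 + math.sqrt((1.0 + log_t) ** 2 + 2.0 * log_m)

def alpha_exact(total_time, initial_time, target_mtbf, initial_mtbf, tol=1e-10):
    """Solve M_I * (T / t_I) ** a / (1 - a) = M_F for a in (0, 1) by bisection."""
    def excess(a):
        growth = (total_time / initial_time) ** a / (1.0 - a)
        return initial_mtbf * growth - target_mtbf

    low, high = 0.0, 1.0 - 1e-9      # excess(low) < 0 and excess(high) > 0
    while high - low > tol:
        mid = 0.5 * (low + high)
        if excess(mid) > 0.0:
            high = mid
        else:
            low = mid
    return 0.5 * (low + high)

# Subsystem A planning values: T = 2300 rounds, t_I = 280 rounds,
# M_F = 200 rounds MRBF, M_I = 70 rounds MRBF.
approx_a = alpha_approx(2300.0, 280.0, 200.0, 70.0)
exact_a = alpha_exact(2300.0, 280.0, 200.0, 70.0)
print(f"approximate alpha = {approx_a:.3f}, numerical alpha = {exact_a:.3f}")
```

The bisection relies on the target MTBF exceeding the initial MTBF and on T exceeding t_I, so that the left-hand side increases monotonically with α.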


1.  Development  of  the  Idealized  Growth  Curve  for  Subsystem  A 


The development of the idealized growth curve is based on the initial estimate of
the MTBF and the constraints on testing, such as the number of units under test and the
resources and time available for testing. For subsystem A, the parameters for constructing
the idealized growth curve are based on the given mission conditions.

A mission reliability of 200 rounds Mean-Rounds-Between-Failure (MRBF) was
required at the end of the reliability growth test. Since this is a combat system, the total
test time for this subsystem is expressed in terms of number of rounds instead of calendar
time, and it was limited to a maximum of 2300 rounds due to resource constraints
in the development program.

Average initial MRBF ($M_I$). The initial MRBF is determined from pre-development
testing of the proposed system. The pre-development testing resulted in 4
mission-affecting failures in 280 rounds. The MRBF was taken to be constant during
this initial testing because no significant design changes were incorporated during the
test, so the MRBF was estimated as:

$$ \text{Initial MRBF} = \frac{280}{4} = 70 \text{ rounds} \tag{3.3} $$


Growth rate ($\alpha$). The initial MRBF is estimated to be 70 rounds and a final
MRBF of 200 rounds is desired after 2300 rounds of testing. For this program, the first
test phase is 280 rounds. The desired growth rate parameter can be determined from the
following equation.

$$
\alpha = -\ln\left(\frac{2300}{280}\right) - 1 + \sqrt{\left(1 + \ln\left(\frac{2300}{280}\right)\right)^2 + 2\ln\left(\frac{200}{70}\right)} \approx 0.32
\tag{3.4}
$$

The growth rate, $\alpha$, is found to be 0.32 for the given conditions; anything less
would require more than 2300 rounds and thus violate the resource constraints. The
approximate $\alpha$ value of 0.32 from equation 3.4 is consistent with the result of the
numerical method. An $\alpha$ value of 0.32 indicates a relatively aggressive development
program that would require emphasis on the analysis and fixing of problem failure modes
[Ref. 16]. Since major development efforts will be focused on subsystem A, an $\alpha$
value of 0.32 is reasonable. The total test time is sensitive to the parameter $\alpha$. As
shown in Table 4, using a test time of less than 2300 rounds would result in a projected
$\alpha$ greater than 0.32, which means that it would require an even more aggressive
reliability growth program.

[Table 4: only two total test duration entries, 1280 and 1080 rounds, are legible]

Table 4. Sensitivity analysis of $\alpha$ on total test duration


The test parameters of the reliability growth plan are summarized in Table 5
below.

Subsystem A
Reliability Target (MRBF)    200 rounds
Total Test Duration          2300 rounds
Growth Rate                  0.32

Table 5. Summary of test parameters for subsystem A


The plan assumed that the MRBF of subsystem A would grow from its initial
level to the required 200 rounds MRBF in accordance with the following form of Duane’s
expression for reliability growth:

$$
M(t) =
\begin{cases}
70, & 0 < t < 280 \\
70 \left( \dfrac{t}{280} \right)^{0.32} (1 - 0.32)^{-1}, & t > 280
\end{cases}
\tag{3.5}
$$

T (rounds)    M(t) (MRBF, rounds)
0             70
280 (-)       70
280 (+)       103
380           114
500           124
650           135
800           144
950           152
1100          159
1250          166
1400          172
1550          178
1700          183
1850          188
2000          193
2300          202

Table 6. Computed MRBF values for the idealized growth curve
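The entries of Table 6 can be regenerated directly from the Duane expression. The sketch below is a minimal illustration using the subsystem A planning values from the text (initial MRBF of 70 rounds, initial phase of 280 rounds, growth rate of 0.32); the function name is hypothetical.

```python
def idealized_mrbf(rounds, initial_mrbf=70.0, initial_rounds=280.0, alpha=0.32):
    """Idealized growth curve: flat during the initial phase, Duane growth after.

    At exactly rounds == initial_rounds the growth branch is returned,
    which corresponds to the "280 (+)" row of the table.
    """
    if rounds < initial_rounds:
        return initial_mrbf
    return initial_mrbf * (rounds / initial_rounds) ** alpha / (1.0 - alpha)

TEST_POINTS = [0, 280, 380, 500, 650, 800, 950, 1100, 1250,
               1400, 1550, 1700, 1850, 2000, 2300]

for rounds in TEST_POINTS:
    print(f"{rounds:5d} rounds  M(t) = {idealized_mrbf(rounds):6.1f}")
```

The jump at 280 rounds, from 70 to about 103, comes from the factor 1/(1 - α), which converts the average MRBF over the initial phase into the instantaneous MRBF at its end.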


Figure 6. Idealized growth curve for subsystem A

The idealized growth curve for subsystem A is shown in Figure 6. The reliability
growth test of subsystem A consists of two additional test phases at 280-1180 rounds and
1100-2300 rounds. More test time is allocated to the last test phase, as more time
will be required to verify the effectiveness of the previous fixes.


2.  Development  of  the  Idealized  Growth  Curve  for  Subsystem  B 

The approach taken for developing the idealized growth curve for subsystem B is
similar to that for subsystem A. A mission reliability of 350 kilometers
Mean-Kilometers-Between-Failure (MKBF) was required at the end of the reliability growth
test. The total test time for the subsystem is expressed in terms of kilometers.

Average initial MKBF ($M_I$). The initial MKBF was estimated during the
prototype run-in test. The run-in test resulted in 6 mission-affecting failures in 1000
kilometers. The MKBF was taken to be constant during this initial testing because no
significant design changes were incorporated during the test, so the MKBF was estimated
as:

$$ \text{Initial MKBF} = \frac{1000}{6} \approx 167 \text{ kilometers} \tag{3.6} $$

Growth rate ($\alpha$) and total test time. The initial MKBF is estimated to be 167
kilometers and a final MKBF of 350 kilometers is desired at the end of the testing. Table
7 shows the various growth rates and the corresponding total test time computed based on
the initial MKBF and test time.


$M_I$ = 167 kilometers, $t_I$ = 1000 kilometers

$\alpha$    0.25   0.26   0.27   0.28   0.29   0.30   0.32   0.34   0.36   0.38
T (km)      6150   5450   4850   4350   3950   3600   3050   2600   2270   2000

Table 7. Growth rate versus total test duration

Based on schedule constraints, the maximum allowable test time was
approximately 4850 kilometers. The corresponding desired growth rate is 0.27. The
approximate $\alpha$ value of 0.27, obtained in the same way as in equation 3.4, is
consistent with the result of the numerical method.
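The trade-off between growth rate and total test duration in Table 7 follows from solving the Duane expression for the test time at which the target MKBF is reached. The sketch below is illustrative; it assumes the unrounded initial MKBF of 1000/6 kilometers, and the tabulated values appear to have been rounded, so small differences are expected.

```python
def total_test_distance(alpha, initial_mkbf, initial_km, target_mkbf):
    """Test distance T at which the idealized curve reaches the target MKBF.

    Obtained by solving M_F = M_I * (T / t_I) ** alpha / (1 - alpha) for T.
    """
    ratio = target_mkbf * (1.0 - alpha) / initial_mkbf
    return initial_km * ratio ** (1.0 / alpha)

INITIAL_MKBF = 1000.0 / 6.0   # 6 failures in 1000 kilometers
INITIAL_KM = 1000.0
TARGET_MKBF = 350.0

GROWTH_RATES = [0.25, 0.26, 0.27, 0.28, 0.29, 0.30, 0.32, 0.34, 0.36, 0.38]
distances = [total_test_distance(a, INITIAL_MKBF, INITIAL_KM, TARGET_MKBF)
             for a in GROWTH_RATES]

for a, km in zip(GROWTH_RATES, distances):
    print(f"alpha = {a:.2f}  T = {km:6.0f} km")
```

The steep drop in required distance as the growth rate rises shows why the schedule limit of about 4850 kilometers fixes the planned growth rate at 0.27.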


Subsystem B
Reliability Target (MKBF)    350 kilometers
Total Test Duration          4850 kilometers
Growth Rate                  0.27

Table 8. Summary of test parameters for subsystem B


The plan assumed that the MKBF would grow from its initial level to the required
350 kilometers MKBF in accordance with the following form of Duane’s expression for
reliability growth:

$$
M(t) =
\begin{cases}
167, & 0 < t < 1000 \\
167 \left( \dfrac{t}{1000} \right)^{0.27} (1 - 0.27)^{-1}, & t > 1000
\end{cases}
\tag{3.7}
$$


T (km)        M(t) (MKBF, km)
0             167
1000 (-)      167
1000 (+)      229
1300          246
1600          260
1900          272
2200          283
2500          293
2800          302
3100          310
3400          318
3700          326
4000          333
4300          339
4600          345
4850          350

Table 9. Computed MKBF values for the idealized growth curve

Figure 7. Idealized growth curve for subsystem B

The idealized growth curve for subsystem B is shown in Figure 7. Similarly, the
reliability growth test of the chassis consists of two additional test phases at 1000-2600
kilometers and 2600-4850 kilometers. More test time is allocated to the last test
phase, as more time will be required to verify the effectiveness of the previous fixes.
Compared with subsystem A, a lower $\alpha$ value of 0.27 for subsystem B is reasonable
since it is an Off-The-Shelf (OTS) system.


IV. RESULTS AND DISCUSSION OF RELIABILITY GROWTH ANALYSIS


A.  INTRODUCTION 


This chapter first introduces the reliability growth models used for the reliability
growth analysis of the combat system and then presents the results and discussion. The
objectives of the reliability growth analysis include:

1. Estimating the demonstrated MRBF and MKBF of the two subsystems at
the end of each test phase.

2. Projecting the MRBF and MKBF of the subsystems if delayed fixes were
incorporated at the end of a test phase.

3. Generating reliability growth plots (e.g. MTBF vs. time) to determine if
reliability is improving, decreasing or constant.

The demonstrated MRBF or MKBF provides an estimate for the system
configuration on test at the end of a test phase. This value is determined by analysis of the
test results using AMSAA reliability growth models. The demonstrated value is then
compared to the idealized growth curve at the end of each test phase to determine if
reliability growth is progressing satisfactorily.

A projected reliability value is an estimate of the increase in system reliability
obtained by taking into account the effect of delayed fixes.

Equations 4.1 to 4.24 in the following sections are taken from Reference 20, and
were formulated by Dr. Larry Crow.


B.  AMSAA  RELIABILITY  GROWTH  MODELS 


There are three types of AMSAA growth models used for reliability growth
analysis: 1) the AMSAA Basic Model for Test-Fix-Test, 2) the AMSAA Projection
Model for Test-Find-Test, and 3) the AMSAA Extended Model for Test-Fix-Find-Test
[Ref. 20]. The distinction between these three models is when fixes are incorporated
into the system.

The test-fix-test model is employed when a corrective action is found immediately
for a failure mode and is incorporated into the system, which is then retested to
verify the effectiveness of the fix and to surface new failure modes. This model estimates the
achieved reliability of the system after all fixes have been incorporated
before the end of a test phase. However, it cannot estimate the increase in reliability due
to delayed fixes that are incorporated at the end of a test.

The test-find-test model is employed when corrective actions for all surfaced
failure modes are incorporated into the system at the end of the test. These corrective
actions result in a distinct jump in system reliability. This model estimates the jump in
reliability due to delayed fixes.

The test-fix-find-test model is a combination of the two types discussed above. It
is employed when some corrective actions are incorporated into the system during the
test while others are delayed until the end of the test.

It is important to note that the choice of model for analysis should not be
determined by the data but rather by a realistic assessment of the test program's corrective
actions.


1.  AMSAA  Basic  Model  for  Test-Fix-Test 

The AMSAA model employs the Weibull process to model reliability growth
during a developmental phase. This model was formulated by Dr. Larry Crow and it is
frequently used on systems when usage is measured on a continuous scale. It is also
designed for tracking reliability within a test phase and not across test phases [Ref. 19].
The test-fix-test model evaluates reliability growth that results from the introduction of
design fixes into the system during a particular test phase.

The AMSAA test-fix-test model assumes that system failures during a
development testing phase follow the non-homogeneous Poisson process (NHPP) with a
Weibull intensity function of the following form:

$$ r(t) = \lambda \beta t^{\beta - 1} \tag{4.1} $$

where

$t$ = cumulative test time

$\lambda$ = the scale parameter, which depends on the unit of measurement chosen for $t$

$\beta$ = the shape parameter (also known as the growth parameter), so called because it
characterizes the shape of the graph of the intensity function

The relationship between the growth rate and the shape parameter is given as:

$$ \alpha_{DUANE} = 1 - \beta_{AMSAA} \tag{4.2} $$

Suppose development testing for a particular test phase stops at time $T$ and no
further improvements are made to the system. In other words, the system
configuration is fixed after time $T$. The demonstrated or achieved failure intensity is

$$ \lambda_{CA} = r(T) = \lambda \beta T^{\beta - 1} \tag{4.3} $$


The demonstrated instantaneous MTBF at the end of the test phase after $T$ units of
testing is given as the reciprocal of the intensity function:

$$ M_{CA} = \left[ \lambda \beta T^{\beta - 1} \right]^{-1} \tag{4.4} $$


$\beta$ is a very important parameter, as it indicates whether there is reliability growth
during the development process. Three possible conditions are reflected by the value of $\beta$:

$\beta < 1$: Positive reliability growth, because the failure rate is decreasing

$\beta = 1$: The constant case. No reliability growth, because the failure rate is constant

$\beta > 1$: Negative reliability growth, because the failure rate is increasing with time
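
The three cases follow from the slope of the intensity function in equation 4.1. As a short check (a sketch, assuming $\lambda > 0$, $\beta > 0$ and $t > 0$), differentiating $r(t)$ with respect to $t$ gives:

```latex
\frac{dr}{dt} = \lambda \beta (\beta - 1)\, t^{\beta - 2}
\quad\Longrightarrow\quad
\operatorname{sign}\!\left( \frac{dr}{dt} \right) = \operatorname{sign}(\beta - 1)
```

The failure intensity therefore falls over time when $\beta < 1$, stays flat when $\beta = 1$, and rises when $\beta > 1$.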

If testing is stopped at time $T$ and significant modifications are then made to the
system, there may be a jump in the system's reliability. However, the AMSAA test-fix-test
model does not estimate the jump due to these delayed fixes.

Parameter Estimation Using the Method of Maximum Likelihood

Estimates of the two parameters $\beta$ and $\lambda$ are made using the method of
maximum likelihood in MIL-HDBK-189 [Ref. 5]. They are estimated from the times-to-failure
data accumulated during a given test phase. It is therefore important
to collect the actual times to failure and the total test time during development testing. The
estimate of the shape parameter $\beta$ is given by

$$ \hat{\beta} = \frac{N}{N \ln T - \sum_{i=1}^{N} \ln X_i} \tag{4.5} $$

where

$N$ = total number of failure occurrences

$T$ = total accumulated test time

$X_i$ = cumulative test time at which the i-th failure occurred

The scale parameter is given by

$$ \hat{\lambda} = \frac{N}{T^{\hat{\beta}}} \tag{4.6} $$
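
Equations 4.5 and 4.6 are straightforward to evaluate in code. The following is a minimal sketch, not the RGA 6 PRO implementation; the function name `crow_amsaa_mle` is hypothetical. It is applied here to the nine Phase 3 failure times listed in Table 16, with a total test time of 1200 rounds.

```python
import math


def crow_amsaa_mle(failure_times, total_time):
    """Maximum likelihood estimates for a time-terminated test.

    failure_times: cumulative test times X_i at which failures occurred
    total_time:    total accumulated test time T
    Returns (beta_hat, lambda_hat) per equations 4.5 and 4.6.
    """
    n = len(failure_times)
    log_sum = sum(math.log(x) for x in failure_times)
    beta_hat = n / (n * math.log(total_time) - log_sum)
    lambda_hat = n / total_time ** beta_hat
    return beta_hat, lambda_hat


# Phase 3 failure times (rounds) from Table 16, T = 1200 rounds.
phase3_times = [55, 101, 212, 317, 379, 465, 520, 579, 900]
beta_hat, lambda_hat = crow_amsaa_mle(phase3_times, 1200)
print(f"beta = {beta_hat:.4f}, lambda = {lambda_hat:.4f}")
```

The estimates should agree, to rounding, with the shape and scale parameters reported for Phase 3 in Table 18.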


Cramer-Von Mises Goodness of Fit Test

Next, the Cramer-Von Mises goodness of fit test is performed to determine if
there is enough evidence to reject the hypothesis that the reliability growth process can
be described by the AMSAA model [Ref. 5]. The Cramer-Von Mises statistic is given by
the following expression:

$$ C_M^2 = \frac{1}{12M} + \sum_{i=1}^{M} \left[ \left( \frac{X_i}{T} \right)^{\bar{\beta}} - \frac{2i - 1}{2M} \right]^2 \tag{4.7} $$

where

$M$ = number of failures

$\bar{\beta}$ = unbiased estimate of the shape parameter, $\bar{\beta} = \frac{M - 1}{M} \hat{\beta}$ for a time-terminated test

If the statistic $C_M^2$ exceeds the critical value corresponding to $M$ for a chosen
significance level, then the null hypothesis that the AMSAA model adequately describes
the growth process is rejected. Otherwise, the model is accepted.
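
The statistic of equation 4.7 can be computed as sketched below. This is an illustration only: the function name `cramer_von_mises` is hypothetical, and the code assumes a time-terminated test, so the unbiased shape estimate is taken as (M - 1)/M times the maximum likelihood estimate. It is applied to the three Phase 1 failure times of Table 10 and compared with the critical value of 0.154 quoted in Table 12.

```python
import math


def cramer_von_mises(failure_times, total_time):
    """Cramer-Von Mises statistic C_M^2 for the AMSAA (NHPP) model."""
    m = len(failure_times)
    log_sum = sum(math.log(x) for x in failure_times)
    beta_hat = m / (m * math.log(total_time) - log_sum)  # equation 4.5
    beta_bar = (m - 1) / m * beta_hat  # unbiased estimate (assumed time-terminated)
    stat = 1.0 / (12 * m)
    for i, x in enumerate(sorted(failure_times), start=1):
        stat += ((x / total_time) ** beta_bar - (2 * i - 1) / (2 * m)) ** 2
    return stat


# Phase 1: failures at 21, 132 and 215 rounds, T = 280 rounds.
phase1_stat = cramer_von_mises([21, 132, 215], 280)
critical_value = 0.154  # M = 3, significance level 0.1 (from Table 12)
model_accepted = phase1_stat < critical_value
print(f"C_M^2 = {phase1_stat:.3f}, accepted = {model_accepted}")
```

For the Phase 1 data this evaluates to about 0.059, the test value reported by RGA 6 PRO.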

2.  AMSAA  Projection  Model  for  Test-Find-Test 

The AMSAA Projection Model for Test-Find-Test classifies all failure modes into
two groups [Ref. 20]:

Type A failure modes. No corrective actions will be taken for Type A failure
modes. Type A failure modes have a constant failure intensity, $\lambda_A$.

Type B failure modes. Failure modes whose corrective actions will only be taken
at the end of the test.

For the test-find-test model, the system failure intensity is constant ($\beta = 1$) during
the test because no corrective actions are incorporated into the system. The system then
experiences a jump in reliability after the incorporation of delayed fixes. The achieved
system failure rate $\lambda_S$ prior to the delayed fixes can be estimated as follows:

$$ \lambda_S = \lambda_A + \lambda_B \tag{4.8} $$

$$ \lambda_A = \frac{\text{Total number of Type A failures}}{\text{Total test time}} = \frac{N_A}{T} \tag{4.9} $$

$$ \lambda_B = \frac{\text{Total number of Type B failures}}{\text{Total test time}} = \frac{N_B}{T} \tag{4.10} $$

The projected failure intensity after the incorporation of delayed fixes is obtained
by assigning an effectiveness factor (EF) $d_j$ to every unique Type B failure
mode. The assigned effectiveness factor, based on engineering assessment, represents the
fractional decrease in the failure rate $\lambda_j$ of the j-th Type B failure mode after fixes have
been incorporated. The total number of Type B failures observed during a test is

$$ N_B = \sum_{j=1}^{M} N_j \tag{4.11} $$

where

$M$ = total number of unique Type B failure modes

$N_j$ = total number of failures for the j-th observed distinct Type B mode

The projected failure intensity is then computed as follows:

$$ \lambda_P = \lambda_A + \sum_{j=1}^{M} (1 - d_j) \frac{N_j}{T} + \bar{d}\, h(T) \tag{4.12} $$

$$ \bar{d} = \text{Average EF} = \frac{\sum_{j=1}^{M} d_j}{M} \tag{4.13} $$

$$ h(T) = \lambda \beta T^{\beta - 1} \tag{4.14} $$

$\lambda$ and $\beta$ are calculated using equations 4.5 and 4.6 based only on the $M$ first
occurrence failure times of the observed unique Type B failure modes [Ref. 20].

The objective of this model is to estimate the jump in MTBF, which is the inverse
of the projected failure intensity:

$$ M_P = \left[ \lambda_P \right]^{-1} \tag{4.15} $$


3.  Extended  Reliability  Growth  Model  for  Test-Fix-Find-Test 

The Extended Model utilizes the A, BC and BD failure mode classification to analyze
reliability growth projection data [Ref. 20].

Type BD failure modes. Corrective actions for Type BD failure modes are
delayed until the end of the test. They are the same as Type B failure modes in the
test-find-test model.

Type BC failure modes. Corrective actions for Type BC failure modes are
incorporated during the test.

Type A failure modes, as before, are those that will not receive any corrective
actions. Type A and Type BD failure modes do not contribute to reliability growth during
the test. The growth in reliability during the test is affected only by corrective actions for
Type BC failure modes. The objective of this model is to estimate the increase in
reliability due to the corrective actions for Type BD failure modes at the end of the test.

The projected failure intensity after the incorporation of delayed fixes into the
system for the Extended Model is

$$ \lambda_{EM} = \lambda_{CA} - \lambda_{BD} + \sum_{j=1}^{M} (1 - d_j) \frac{N_j}{T} + \bar{d}\, h(T \mid BD) \tag{4.16} $$

The first term $\lambda_{CA}$ is the failure rate prior to the delayed fixes. It is the same as equation
4.3 applied to all A, BC and BD failure modes. The remaining terms are calculated in the
same manner as the AMSAA Test-Find-Test model using only data for BD failure
modes.

Finally, the projected MTBF after the incorporation of delayed fixes into the
system for the Extended Model is the inverse of the failure intensity:

$$ M_{EM} = \left[ \lambda_{EM} \right]^{-1} \tag{4.17} $$

In addition, the AMSAA Extended Test-Fix-Find-Test Model can be used to
analyze test-fix-test data and test-find-test data by designating all failure modes as BC and
BD respectively.
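
The two special cases can be read off the Extended Model intensity. The following is a short sketch, assuming that in the test-find-test case the achieved intensity is evaluated with $\beta = 1$ (no fixes during the test), so that it reduces to the total number of A and BD failures divided by $T$:

```latex
\begin{aligned}
&\text{All modes BC (no BD terms):} &
\lambda_{EM} &= \lambda_{CA} \\
&\text{All modes A or BD:} &
\lambda_{CA} &= \lambda_A + \lambda_{BD}
\;\Longrightarrow\;
\lambda_{EM} = \lambda_A + \sum_{j=1}^{M} (1 - d_j)\frac{N_j}{T} + \bar{d}\, h(T \mid BD)
\end{aligned}
```

The first line recovers the achieved intensity of the Basic Model (equation 4.3) and the second recovers the projection of equation 4.12.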
C.  RELIABILITY GROWTH ANALYSIS FOR SUBSYSTEM A


Reliability growth for subsystem A is tracked over three phases of testing.
Reliability is tracked on a phase-by-phase basis using test data collected within each test
phase. ReliaSoft's RGA 6 PRO software is used for analyzing the collected data and
generating reliability growth plots [Ref. 21]. The type of reliability growth model
selected must be based on the management approach employed in each test
phase:

Phase 1: AMSAA Extended Test-Find-Test Projection Model

Phase 2: AMSAA Extended Model for Test-Fix-Test

Phase 3: AMSAA Extended Model for Test-Fix-Find-Test

The AMSAA Basic Model for Test-Fix-Test does not utilize any failure mode
designation, but the AMSAA Extended Model does. Specific knowledge of Type BC and
Type BD failure modes can help to generate useful metrics for decision making and
engineering purposes [Ref. 20]. The AMSAA Extended Models are used to analyze both
test-fix-test and test-find-test data by setting all failure modes to BC and BD respectively.
The underlying mathematical principles of the AMSAA Basic Test-Fix-Test Model and
AMSAA Basic Test-Find-Test Model remain unchanged.

1.  Phase  1  results  and  analysis 

In Phase 1, the prototype system was subjected to 280 rounds of testing according
to the test plan. Since this test phase is short, fixes were not incorporated into the system
during the test. During the test, three failures were identified, but all corrective actions
were delayed until the end of the test. This management strategy is known as test-find-test.
The AMSAA Extended Test-Find-Test Projection Model is selected to analyze the
reliability of the system after the incorporation of delayed fixes. All failure modes
identified during the test will receive a delayed corrective action; therefore all failures are
classified as Type BD. These failures were also assigned a failure category
according to their root cause, as shown in Table 10.


j   Time to Event, X_j   Classification   Mode   Failure Category
1   21                   BD               1      Faulty component
2   132                  BD               2      Design
3   215                  BD               3      Design

Table 10. Test-find-test data for Phase 1

BD Mode   Number of Failures, N_j   Time to First Occurrence   EF, d_j
1         1                         21                         0.65
2         1                         132                        0.7
3         1                         215                        0.7

Table 11. Test-find-test Type B failure mode data and EF for Phase 1

Table 11 shows the frequency and the assigned effectiveness factor (EF) for each
Type BD failure mode. The EF is an engineering estimate based on the probability that
the fix is effective in mitigating or reducing the probability of occurrence of the
particular failure mode. An EF of 1.0 is not practical in most cases since a fix is
unlikely to eliminate a failure mode completely. Studies have shown that an
average effectiveness factor of 0.7 is reasonable for a typical reliability growth program
[Ref. 20]. Failure mode BD1 was assigned a lower EF due to the high uncertainty
associated with the effectiveness of the corrective action.


Since the test data consist of only Type BD failure modes, the achieved system
failure intensity can be estimated by equation 4.8:

$$ \lambda_S = \lambda_{BD} = \frac{N}{T} = \frac{3}{280} = 0.0107 $$

The estimated achieved MRBF at $T = 280$ rounds before the jump is the inverse of
the achieved system failure intensity:

$$ \text{MRBF} = \frac{1}{\lambda_S} = 93.3 \text{ rounds} $$

Next, the projected failure intensity due to the delayed fixes is calculated using
equation 4.12:

$$ \lambda_P = \lambda_A + \sum_{j=1}^{M} (1 - d_j) \frac{N_j}{T} + \bar{d}\, h(T) $$

The average EF of the delayed fixes is given by equation 4.13:

$$ \bar{d} = \text{Average EF} = \frac{\sum_{j=1}^{M} d_j}{M} = \frac{0.65 + 0.7 + 0.7}{3} = 0.683 $$

The term $h(T \mid BD) = \lambda \beta T^{\beta - 1}$ from equation 4.14 is a function of $\beta$ and $\lambda$. These
two parameters are estimated from equations 4.5 and 4.6 using first occurrence data from
Table 11:

$$ \hat{\beta} = \frac{N}{N \ln T - \sum_{i=1}^{N} \ln X_i} = \frac{3}{3 \ln 280 - (\ln 21 + \ln 132 + \ln 215)} = 0.8319 $$

$$ \hat{\lambda} = \frac{N}{T^{\hat{\beta}}} = \frac{3}{280^{0.8319}} = 0.0276 $$

$$ h(280 \mid BD) = 0.0089 $$

This metric $h(T \mid BD)$ represents the intensity for Type BD failure modes that
have not been seen during the testing, that is, the rate at which new distinct
Type BD modes are occurring at the end of the test.

With all the above parameters defined, the projected failure intensity can be
calculated:

$$ \lambda_P = \lambda_A + \sum_{j=1}^{M} (1 - d_j) \frac{N_j}{T} + \bar{d}\, h(T \mid BD) = \sum_{j=1}^{3} (1 - d_j) \frac{N_j}{280} + 0.683 \times 0.0089 = 0.00952 $$

The projected MRBF due to the jump is the inverse of the projected failure intensity:

$$ M_P = \frac{1}{\lambda_P} = 105 \text{ rounds} $$
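
The Phase 1 projection can be reproduced with a few lines of code. The sketch below is illustrative only (the function name `projected_mrbf` and its argument names are assumptions, not part of RGA 6 PRO); it uses the first occurrence times, failure counts and EF values of Table 11 and carries full precision through the intermediate steps, so its results can differ slightly from the rounded hand calculation.

```python
import math


def projected_mrbf(first_times, counts, efs, total_time, n_type_a=0):
    """AMSAA projection model (equations 4.12 to 4.15).

    first_times: first occurrence time of each unique Type B/BD mode
    counts:      number of failures N_j observed for each mode
    efs:         effectiveness factor d_j assigned to each mode
    n_type_a:    number of Type A failures (zero in Phase 1)
    """
    m = len(first_times)
    log_sum = sum(math.log(x) for x in first_times)
    beta = m / (m * math.log(total_time) - log_sum)      # equation 4.5
    lam = m / total_time ** beta                         # equation 4.6
    h = lam * beta * total_time ** (beta - 1)            # equation 4.14
    d_bar = sum(efs) / m                                 # equation 4.13
    remaining = sum((1 - d) * n / total_time for d, n in zip(efs, counts))
    lam_p = n_type_a / total_time + remaining + d_bar * h  # equation 4.12
    return lam_p, 1.0 / lam_p                            # equation 4.15


lam_p, mrbf_p = projected_mrbf([21, 132, 215], [1, 1, 1], [0.65, 0.7, 0.7], 280)
print(f"projected intensity = {lam_p:.5f}, projected MRBF = {mrbf_p:.1f} rounds")
```

With these inputs the projected intensity comes out at roughly 0.0095 failures per round, giving a projected MRBF of about 105 rounds.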

For a two-sided confidence level of 90%, the projected MRBF is between 40 and
278 rounds.


Projection Summary
β:   0.8319      PMTBF:   105.4
λ:   0.0276      DMTBF:   93.33

Statistical Results
                         Result   Test Value   Upper
Cramer-Von Mises (BD)    Passed   0.059        0.154

Table 12. RGA 6 PRO projection summary and Cramer-Von Mises test results for Phase 1


The RGA 6 PRO results shown in Table 12 are consistent with the hand
calculated values. The Cramer-Von Mises statistic of 0.059 is below the critical value of
0.154 for a significance level of 0.1. Hence the hypothesis that the AMSAA model is
applicable is accepted.

Figure 8 shows the plot of reliability versus time for subsystem A during the test.
The MRBF is constant ($\beta = 1$) during the test because no fixes were implemented on the
system, and thus the system failure rate remains constant during the test. There is a jump
in reliability at the end of the test due to fixes being incorporated into the system. The
projection model estimates that the system MRBF jumps to a value of 105 rounds due to
three distinct corrective actions with the corresponding EF stated in Table 11. The
estimated MRBF of 105 rounds after the incorporation of fixes exceeds the planned
target of 104 rounds at the end of Phase 1.

Figure 8. MRBF projection for Phase 1


In addition, the AMSAA Extended Test-Find-Test Projection Model can also be
used to estimate the fractions of the Type BD failure intensity due to seen and unseen
failure modes at the end of the test. The intensity for Type BD failure modes that have
been seen in the testing can be estimated as follows:

$$ \lambda_{BD} - h(280 \mid BD) = 0.0107 - 0.0089 = 0.0018 \tag{4.18} $$

The fraction of Type BD failure intensity due to failure modes that have been seen
in test is [Ref. 20]:

$$ \text{Fraction Seen} = \frac{\lambda_{BD} - h(280 \mid BD)}{\lambda_{BD}} = 0.168 \tag{4.19} $$

The fraction of Type BD failure intensity due to failure modes that have not been
seen in test is [Ref. 20]:

$$ \text{Fraction Unseen} = \frac{h(280 \mid BD)}{\lambda_{BD}} = 0.831 \tag{4.20} $$
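
A brief consistency check, offered as a sketch: substituting the scale estimate of equation 4.6 into equation 4.14 gives $h(T \mid BD) = M\hat{\beta}/T$, so when every BD mode has been observed exactly once ($N_{BD} = M$, as in Table 11) the unseen fraction is simply the shape parameter estimate.

```latex
\frac{h(T \mid BD)}{\lambda_{BD}}
= \frac{\hat{\lambda}\,\hat{\beta}\,T^{\hat{\beta}-1}}{N_{BD}/T}
= \frac{M\hat{\beta}/T}{N_{BD}/T}
= \frac{M}{N_{BD}}\,\hat{\beta}
= \hat{\beta} \approx 0.832
\qquad (M = N_{BD} = 3)
```

This agrees, up to rounding, with the unseen fraction of 0.831 reported above.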


Figure 9 displays the failure rate for each Type BD failure mode before and after
implementing the fixes. It gives clear visibility of the contribution of each
individual Type BD failure mode to the system's overall failure rate. Failure mode BD1
appears to have the highest failure rate after the fixes in Figure 9, as the remaining failure
rate depends directly on the assumed EF. In this case, the EF for failure mode BD1 has been
assumed a lower value than for BD2 and BD3, so BD1 should be the main focus of the failure
management strategy. The ability to designate failure modes has certainly provided clearer
management and engineering insights when formulating the failure management strategy.

Figure 9. Before and after failure rate for Type BD failure modes in Phase 1 based on
frequency and EF
2.  Phase 2 results and analysis

The testing approach used in Phase 2 is Test-Analyze-And-Fix (TAAF), in which
fixes for all failure modes discovered are incorporated during the test. The system
is tested for 820 rounds in this phase. The AMSAA Extended Test-Fix-Test Model
designates all failures as Type BC, as shown in Table 13.


j   Time to Event, X_j   Classification   Mode   Failure Category
1   27                   BC               1      Faulty component
2   72                   BC               2      Design
3   122                  BC               2      Design
4   265                  BC               3      Software
5   317                  BC               4      Design
6   394                  BC               5      Design
7   455                  BC               2      Design
8   719                  BC               6      Faulty component

Table 13. Test-fix-test data for Phase 2


BC Mode   Number of Failures, N_j   Time to First Occurrence
1         1                         27
2         3                         72
3         1                         265
4         1                         317
5         1                         394
6         1                         719

Table 14. Unique first time occurrence BC failure mode for Phase 2


During Phase 2, six unique Type BC failure modes were observed in 820
rounds of testing. The demonstrated MRBF is calculated next.

The shape parameter is estimated using equation 4.5:

$$ \hat{\beta} = \frac{N}{N \ln T - \sum_{i=1}^{N} \ln X_i} = \frac{8}{8 \ln 820 - (\ln 27 + \ln 72 + \ln 122 + \ln 265 + \ln 317 + \ln 394 + \ln 455 + \ln 719)} = 0.7089 $$

The calculated $\beta$ of 0.7089 ($\beta < 1$) implies positive reliability
growth in this phase.

The relationship between the growth rate and the shape parameter is given by
equation 4.2:

$$ \alpha_{DUANE} = 1 - \beta_{AMSAA} = 1 - 0.7089 = 0.2911 $$

The calculated growth rate of 0.2911 is close to but falls below the desired value
of 0.32. It implies that reliability is not growing as fast as planned.
The scale parameter is estimated using equation 4.6:

$$ \hat{\lambda} = \frac{N}{T^{\hat{\beta}}} = \frac{8}{820^{0.7089}} = 0.0687 $$

The achieved failure intensity is given by equation 4.3:

$$ \lambda_{CA} = r(T) = \lambda \beta T^{\beta - 1} = 0.0687 \times 0.7089 \times 820^{0.7089 - 1} = 0.0069 $$

The demonstrated instantaneous MRBF at the end of Phase 2 after 820 rounds of
testing is the reciprocal of the intensity function given by equation 4.4:

$$ M_{CA} = \frac{1}{\lambda_{CA}} = 145 \text{ rounds} $$

For a two-sided confidence level of 90%, the demonstrated MRBF is between 63
and 431 rounds.

Another useful metric that can be determined from the test data is the initial system
MRBF at the beginning of this phase [Ref. 20]:

$$ M_I = \frac{\Gamma\left( 1 + \frac{1}{\beta} \right)}{\lambda^{1/\beta}} = 54 \text{ rounds} \tag{4.21} $$
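
The demonstrated and initial MRBF for Phase 2 can be checked numerically. The following is a minimal sketch under the same assumptions as the hand calculation (time-terminated test at 820 rounds, failure times from Table 13); the function name `phase_mrbf` is hypothetical.

```python
import math


def phase_mrbf(failure_times, total_time):
    """Return (demonstrated MRBF, initial MRBF) for one test phase."""
    n = len(failure_times)
    log_sum = sum(math.log(x) for x in failure_times)
    beta = n / (n * math.log(total_time) - log_sum)               # equation 4.5
    lam = n / total_time ** beta                                  # equation 4.6
    demonstrated = 1.0 / (lam * beta * total_time ** (beta - 1))  # equation 4.4
    initial = math.gamma(1 + 1 / beta) / lam ** (1 / beta)        # equation 4.21
    return demonstrated, initial


# Phase 2 failure times (rounds) from Table 13, T = 820 rounds.
demonstrated, initial = phase_mrbf([27, 72, 122, 265, 317, 394, 455, 719], 820)
print(f"demonstrated MRBF = {demonstrated:.1f}, initial MRBF = {initial:.1f}")
```

At full precision the demonstrated MRBF is about 144.6 rounds and the initial MRBF is about 54 to 55 rounds, in line with the values quoted above.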

The initial MRBF of 54 rounds at the beginning of Phase 2 falls within the
confidence interval of 40 to 278 rounds at the end of Phase 1. It is estimated that the
system MRBF was 54 rounds at the beginning of the test and that, due to six distinct
fixes, the reliability grew to 145 rounds at the end of 820 rounds of testing.


Analysis Summary
Model:               Crow-AMSAA (NHPP)
Analysis Method:     MLE
Test Procedure:      Developmental
Input Type:          Cumulative
Termination Time:    820
β:                   0.7089
λ:                   0.0688
Growth Rate:         0.2911
Instant. MTBF:       144.58

Statistical Results
                     Result   Test Value   Upper
Cramer-Von Mises     Passed   0.035        0.165

Table 15. RGA 6 PRO analysis summary and Cramer-Von Mises test results for Phase 2


The RGA 6 PRO results presented in Table 15 are consistent with the
hand calculated values. The Cramer-Von Mises statistic of 0.035 is below the critical
value of 0.165 for a significance level of 0.1. Hence the hypothesis that the AMSAA
model is appropriate is accepted.

Figure 10. MRBF projection for Phase 2 (instantaneous MRBF vs time in rounds)


Figure 10 indicates that reliability is increasing with time. The effective
application of the TAAF approach in surfacing and fixing failure modes has contributed
to reliability growth in this phase. According to the idealized growth curve, the expected
MRBF at the end of Phase 2 should approach 159 rounds. The demonstrated MRBF of
145 rounds is close to the expected target.

However, one main concern identified in this phase is the relatively high frequency
of mode BC2, as shown in Figure 11. An effective failure management strategy at this
point of the program should focus on fixing failure mode BC2 by allocating more
resources to identify its root cause and improve the current corrective actions.
3.  Phase 3 results and analysis

In Phase 3, some fixes are incorporated into the system during the test while
others are delayed until the end of the test. The fixes are delayed because of
1) unavailability of spare parts or tools required for component replacement or repair
and 2) inability to identify the root cause of failure during the test. This type of data is a
combination of test-fix-test and test-find-test, known as test-fix-find-test. The
AMSAA Extended Test-Fix-Find-Test Model is used for analyzing the data. Nine
failures were observed in 1200 rounds of testing. The failures that receive a corrective
action during the test are classified as BC, while those whose fixes are delayed are classified
as BD. All the failures surfaced in this phase are presented in Table 16.

j   Time to Event, X_j   Classification   Mode   Failure Category
1   55                   BD               1      Faulty component
2   101                  BC               1      Design
3   212                  BC               1      Software
4   317                  BC               2      Faulty component
5   379                  BC               3      Software
6   465                  BC               4      Design
7   520                  BD               2      Faulty component
8   579                  BD               3      Quality
9   900                  BC               5      Workmanship

Table 16. Test-fix-find-test data for Phase 3


BD Mode   Number of Failures, N_j   Time to First Occurrence   EF, d_j
1         1                         55                         0.6
2         1                         520                        0.6
3         1                         579                        0.6

Table 17. Test-find-test Type BD failure mode data and EF for Phase 3


There are six unique BC failure modes and three unique BD failure modes in this
phase. The EF for all BD failure modes is conservatively assigned as 0.6 because this is the
last test phase, with no further testing to verify their effectiveness. The assigned EF will
be used for estimating the jump in the system reliability due to the delayed fixes.

The failure intensity after 1200 rounds of testing, before
incorporation of delayed fixes, is estimated using equation 4.3:

$$ \lambda_{CA} = r(T) = \lambda \beta T^{\beta - 1} $$

The shape parameter $\beta$ is calculated using equation 4.5 based on the data in
Table 16:

$$ \hat{\beta} = \frac{N}{N \ln T - \sum_{i=1}^{N} \ln X_i} = \frac{9}{9 \ln 1200 - (\ln 55 + \ln 101 + \ln 212 + \ln 317 + \ln 379 + \ln 465 + \ln 520 + \ln 579 + \ln 900)} = 0.715 $$

The calculated $\beta$ of 0.715 ($\beta < 1$) implies positive reliability growth in this
phase.

The growth rate is given by equation 4.2:

$$ \alpha_{DUANE} = 1 - \beta_{AMSAA} = 1 - 0.715 = 0.285 $$

The calculated growth rate of 0.285 is consistent with that of Phase 2.

The scale parameter is given by equation 4.6:

$$ \hat{\lambda} = \frac{N}{T^{\hat{\beta}}} = \frac{9}{1200^{0.715}} = 0.0563 $$

The achieved failure intensity before the incorporation of the delayed fixes at a
cumulative time of 1200 rounds is

$$ \lambda_{CA} = r(T) = \lambda \beta T^{\beta - 1} = 0.0563 \times 0.715 \times 1200^{0.715 - 1} = 0.00533 $$

The achieved MRBF is the inverse of the failure intensity:

$$ M_{CA} = \left[ \lambda_{CA} \right]^{-1} = 186 \text{ rounds} $$

For a two-sided confidence level of 90%, the demonstrated MRBF is between 85
and 512 rounds.

Analysis Results
β:   0.715       DMTBF:   186.3
λ:   0.0563

Statistical Results
                     Result   Test Value   Upper
Cramer-Von Mises     Passed   0.0995       0.167

Table 18. RGA 6 PRO failure modes analysis results and Cramer-Von Mises statistical test
results for Phase 3

The RGA 6 PRO results presented in Table 18 are consistent with the
hand calculated values. The Cramer-Von Mises statistic of 0.0995 is below the critical
value of 0.167 for a significance level of 0.1. Hence the hypothesis that the AMSAA
model is appropriate is accepted.

The demonstrated MRBF of 186 rounds at the end of Phase 3, prior to the
incorporation of fixes, did not meet the requirement of 200 rounds because the achieved
growth rate of 0.29 is below the planned value of 0.32. However, a trend of a decreasing
number of failures is evident from the cumulative number of failures versus time plot in
Figure 12. Only one failure was observed in the last 600 rounds of testing. Figure 12
also shows that the results are slightly biased, as the number of failures at each instant of
time is underestimated. The next step is to estimate the jump in reliability as a
result of delayed fixes.

Figure 12. Cumulative number of failures vs time plot for Phase 3

Figure 13. MRBF vs time plot for Phase 3


The projected failure intensity after the incorporation of delayed fixes into the
system is calculated using equation 4.16:

$$ \lambda_{EM} = \lambda_{CA} - \lambda_{BD} + \sum_{j=1}^{M} (1 - d_j) \frac{N_j}{T} + \bar{d}\, h(T \mid BD) $$

The failure intensity for Type BD failure modes is given by

$$ \lambda_{BD} = \frac{\text{Total number of Type BD failures}}{\text{Total test time}} = \frac{3}{1200} = 0.0025 $$

The term $h(T \mid BD) = \lambda \beta T^{\beta - 1}$ from equation 4.14 is a function of $\beta$ and $\lambda$.
These two parameters are estimated from equations 4.5 and 4.6 using first occurrence
data from Table 17:

$$ \hat{\beta} = \frac{N}{N \ln T - \sum_{i=1}^{N} \ln X_i} = \frac{3}{3 \ln 1200 - (\ln 55 + \ln 520 + \ln 579)} = 0.645 $$

The growth rate due to the three distinct fixes is given by equation 4.2:

$$ \alpha_{DUANE} = 1 - \beta_{AMSAA} = 1 - 0.645 = 0.355 $$

$$ \hat{\lambda} = \frac{N}{T^{\hat{\beta}}} = \frac{3}{1200^{0.645}} = 0.0309 \text{ failures/round} $$

$$ h(T \mid BD) = \lambda \beta T^{\beta - 1} = 0.001615 $$

This metric $h(T \mid BD)$ represents the intensity for Type BD failure modes that
have not been seen during the testing, that is, the rate at which new distinct
Type BD modes are occurring at the end of the test.

The average EF is given by equation 4.13:

$$ \bar{d} = \text{Average EF} = \frac{\sum_{j=1}^{M} d_j}{M} = \frac{0.6 + 0.6 + 0.6}{3} = 0.6 $$

Finally, the projected failure intensity after the incorporation of delayed fixes into
the system for the Extended Model can be determined as

$$ \lambda_{EM} = \lambda_{CA} - \lambda_{BD} + \sum_{j=1}^{M} (1 - d_j) \frac{N_j}{T} + \bar{d}\, h(T \mid BD) = 0.00533 - 0.0025 + \sum_{j=1}^{3} (1 - d_j) \frac{N_j}{1200} + 0.6 \times 0.001615 = 0.00483 $$

The Extended Model projected MRBF after the incorporation of delayed fixes at the
end of the test is given by equation 4.17:

$$ M_{EM} = \frac{1}{\lambda_{EM}} = 206.7 \text{ rounds} $$
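
The Phase 3 calculation chains together most of the equations in this chapter, so a compact numerical sketch may help. The code below is an illustration under the assumptions stated in the text (time-terminated test at 1200 rounds, data from Tables 16 and 17); the function names `mle` and `extended_projection` are hypothetical and this is not the RGA 6 PRO implementation.

```python
import math


def mle(times, total_time):
    """Shape and scale estimates from equations 4.5 and 4.6."""
    n = len(times)
    beta = n / (n * math.log(total_time) - sum(math.log(x) for x in times))
    return beta, n / total_time ** beta


def extended_projection(all_times, bd_first_times, bd_counts, bd_efs, total_time):
    """Return (demonstrated MRBF, projected MRBF) for the Extended Model."""
    # Achieved intensity from all A, BC and BD failures (equation 4.3).
    beta_all, lam_all = mle(all_times, total_time)
    lam_ca = lam_all * beta_all * total_time ** (beta_all - 1)
    # BD terms use only BD data, as in the test-find-test model.
    lam_bd = sum(bd_counts) / total_time
    beta_bd, lam_fit = mle(bd_first_times, total_time)
    h = lam_fit * beta_bd * total_time ** (beta_bd - 1)      # equation 4.14
    d_bar = sum(bd_efs) / len(bd_efs)                        # equation 4.13
    remaining = sum((1 - d) * n / total_time for d, n in zip(bd_efs, bd_counts))
    lam_em = lam_ca - lam_bd + remaining + d_bar * h         # equation 4.16
    return 1.0 / lam_ca, 1.0 / lam_em                        # equations 4.4, 4.17


all_times = [55, 101, 212, 317, 379, 465, 520, 579, 900]   # Table 16
demonstrated, projected = extended_projection(
    all_times, [55, 520, 579], [1, 1, 1], [0.6, 0.6, 0.6], 1200
)
print(f"demonstrated = {demonstrated:.1f}, projected = {projected:.1f} rounds")
```

Passing a different list of EF values reruns the projection, which is a convenient way to explore how sensitive the projected MRBF is to the assumed effectiveness of the delayed fixes.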
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For  a  two-sided  confidence  level  of  90  %,  the  projected  MRBF  is  between  105 
and  404  rounds. 

A sensitivity analysis on the effect of varying EF on MRBF was carried out, and the results show that the projected MRBF increases to 220 rounds, or by 7 percent, if the assumed EF is 0.9. It can therefore be concluded that the resulting MRBF does not vary significantly between the two extreme values of EF.


Analysis Results
  β:      0.6455
  PMTBF:  206.7
  λ:      0.0309

Statistical Results
  Test                     Result   Test Value   Upper
  Cramer-Von Mises (BD)    Passed   0.0872       0.154

Table 19. RGA 6 PRO BD failure modes analysis results and Cramer-Von Mises statistical test results for Phase 3


The results generated by RGA 6 PRO and presented in Table 19 are consistent with the hand-calculated values. The Cramer-Von Mises statistic of 0.0872 is below the critical value of 0.154 for a significance level of 0.1. Hence the hypothesis that the AMSAA model is appropriate is accepted.
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Figure  14.  Projected  MRBF  vs  time  plot  for  Phase  3 
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Figure  15.  Individual  failure  rate  of  Type  BC  and  BD  failure  modes  at  the  end  of  Phase  3 


The Extended Model estimates that the MRBF grew to 186 rounds as a result of three corrective actions for BC failure modes during the test. It then jumps to 206.7 rounds as a result of the delayed corrective actions for the Type BD failure modes, even with conservative EF estimates of 0.6. The system is estimated to meet the reliability requirements after taking into account the effect of delayed fixes. To provide additional insight, Figure 15 shows the individual failure rate contribution of both Type BC and Type BD failure modes. In comparison, failure mode Type BC1 has the highest relative failure rate. On the other hand, the failure rates of Type BD1, BD2 and BD3 have decreased significantly after fixing. To substantiate this claim from an engineering viewpoint, the fixes for these three BD modes involve only basic component replacement or repair. The assigned EF of 0.6 is a conservative estimate for simple fixes, and hence it can be concluded that the projected MRBF of 206 rounds is a realistic estimate. For the reliability of the system to grow further, efforts should be focused on improving the corrective action for failure mode BC1, even though the last corrective action has proven to be effective.

The system grows from an initial demonstrated MRBF of 93 rounds to 206 rounds. In conclusion, the system is estimated to meet the reliability requirements at the end of the RGT. However, the projected MRBF falls between 105 and 404 rounds for a two-sided confidence level of 90 %.




D. RELIABILITY GROWTH ANALYSIS FOR SUBSYSTEM B


Reliability growth for subsystem B is tracked over three phases of testing, on a phase-by-phase basis. As with subsystem A, the data collected during each phase is analyzed using ReliaSoft's RGA 6 PRO software [Ref. 21]. The reliability growth models selected for the three test phases are:

Phase 1: AMSAA Extended Test-Find-Test Projection Model

Phase 2: AMSAA Extended Test-Fix-Test Model

Phase 3: AMSAA Extended Test-Fix-Test Model


1.  Phase  1  results  and  analysis 


In  Phase  1,  the  prototype  system  was  subjected  to  1000  kilometers  of  testing. 
During  the  test  five  failures  were  identified  but  all  corrective  actions  were  delayed  till  the 
end  of  the  test.  This  management  strategy  is  known  as  test-find-test.  The  AMSAA 
Extended  Test-Find-Test  Model  is  selected  to  analyze  the  reliability  of  the  system  after 
the  incorporation  of  delayed  fixes.  The  failures  identified  during  the  test  were  classified 
into  their  respective  failure  category  as  shown  in  Table  20. 


j    Time to Event, Xj    Classification    Mode    Failure Category
1    159                  BD                1       Workmanship
2    252                  BD                2       Quality
3    299                  BD                3       Design
4    555                  BD                3       Design
5    967                  BD                4       Quality

Table 20. Test-find-test data for Phase 1

BD Mode    Number of Failures, Nj    Time to First Occurrence    EF, dj
1          1                         159                         0.7
2          1                         252                         0.7
3          2                         299                         0.6
4          1                         967                         0.7

Table 21. Test-find-test Type B failure mode data and effectiveness factor for Phase 1


There are no Type A failure modes since all failures will receive corrective actions. The five failures fall into four unique Type B failure modes, all of which are classified as BD. The EF assigned to each BD failure mode in Table 21 is based on an engineering assessment of the effectiveness of the corrective action.

The achieved system failure intensity is contributed only by Type B failure modes and is estimated by equation 4.8:

$$\lambda_{CA} = \lambda_{BD} = \frac{N}{T} = \frac{5}{1000} = 0.005$$

The estimated achieved MKBF at T = 1000 kilometers before the jump is

$$M_{CA} = \frac{1}{\lambda_{CA}} = 200 \text{ kilometers}$$


Next, the projected failure intensity is calculated using equation 4.12.

$$\lambda_{P} = \lambda_{A} + \sum_{j=1}^{M}(1-d_j)\frac{N_j}{T} + \bar{d}\,h(T/BD)$$

The average EF of the delayed fixes is given by equation 4.13

$$\bar{d} = \text{Average EF} = \frac{\sum_{j=1}^{M} d_j}{M} = \frac{0.7 + 0.7 + 0.6 + 0.7}{4} = 0.675$$


The term $h(T/BD) = \lambda\beta T^{\beta-1}$ from equation 4.14 is a function of $\beta$ and $\lambda$. These two parameters are estimated from equations 4.5 and 4.6 using first occurrence data from Table 21.

$$\beta = \frac{N}{N\ln T - \sum_{i=1}^{N}\ln X_i} = \frac{4}{[4\ln 1000 - (\ln 159 + \ln 252 + \ln 299 + \ln 967)]} = 0.8973$$

$$\lambda = \frac{N}{T^{\beta}} = \frac{4}{1000^{0.8973}} = 0.00813 \text{ failures/kilometer}$$

$$h(T/BD) = \lambda\beta T^{\beta-1} = 0.00358$$


This metric $h(T/BD)$ represents the intensity of the Type BD failure modes that have not yet been seen during testing, that is, the rate at which new distinct Type BD modes are occurring at the end of the test.

With all the above parameters defined, the projected failure intensity can be calculated.

$$\lambda_{P} = \lambda_{A} + \sum_{j=1}^{M}(1-d_j)\frac{N_j}{T} + \bar{d}\,h(T/BD) = \sum_{j=1}^{4}(1-d_j)\frac{N_j}{1000} + 0.675 \times 0.00358 = 0.00411$$


The projected MKBF due to the jump is the inverse of the projected failure intensity given by equation 4.15.

$$M_{P} = \frac{1}{\lambda_{P}} = 242 \text{ kilometers}$$

For  a  two-sided  confidence  level  of  90%,  the  projected  MKBF  is  between  103  and 
829  kilometers. 
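The hand calculation of the projected MKBF can be checked with a short script. The sketch below is illustrative only: it assumes the projection equation in the form used above, with the Type A intensity set to zero because every failure mode receives a fix, and the function name `project_mkbf` is hypothetical rather than part of RGA 6 PRO.

```python
import math

def project_mkbf(counts, efs, first_times, total_time, lam_a=0.0):
    """Projected failure intensity and MKBF after delayed (Type BD) fixes.

    counts      -- number of failures N_j observed for each BD mode
    efs         -- effectiveness factor d_j assigned to each BD mode
    first_times -- first-occurrence time of each BD mode
    lam_a       -- Type A failure intensity (zero here, all modes are fixed)
    """
    m = len(first_times)
    beta = m / (m * math.log(total_time)
                - sum(math.log(x) for x in first_times))
    lam = m / total_time ** beta
    h = lam * beta * total_time ** (beta - 1)   # intensity of unseen BD modes
    remaining = sum((1 - d) * n / total_time for n, d in zip(counts, efs))
    d_bar = sum(efs) / m                        # average EF
    lam_p = lam_a + remaining + d_bar * h
    return lam_p, 1 / lam_p

# Table 21 data for Phase 1, T = 1000 kilometers
lam_p, mkbf_p = project_mkbf([1, 1, 2, 1], [0.7, 0.7, 0.6, 0.7],
                             [159, 252, 299, 967], 1000)
print(f"projected intensity = {lam_p:.5f}, projected MKBF = {mkbf_p:.1f} km")
```

Because the script carries the unrounded shape parameter through to the end, its result should sit slightly closer to the software value than to the rounded hand calculation.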


A sensitivity analysis on the effect of varying EF on MKBF was carried out, and the results show a 10 percent difference if the EF is varied from 0.6 to 0.9. It can therefore be concluded that, although MKBF increases with increasing EF, using the two extreme values of EF does not produce a very significant difference in the MKBF.
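The sensitivity of the projection to EF can be explored in the same way. The following sketch assumes that a single EF value is applied to all four BD modes, which is an assumption made here for illustration because the exact EF combinations used in the sensitivity analysis are not listed; `h_unseen` is the rounded value of h(T/BD) from the hand calculation.

```python
def mkbf_for_uniform_ef(ef, n_failures=5, total_time=1000.0, h_unseen=0.00358):
    """Projected MKBF when every BD mode is assigned the same EF.

    With a common EF d and no Type A modes, the projection reduces to
    lam_p = (1 - d) * N / T + d * h(T/BD).
    """
    lam_p = (1 - ef) * n_failures / total_time + ef * h_unseen
    return 1 / lam_p

results = {ef: mkbf_for_uniform_ef(ef) for ef in (0.6, 0.7, 0.8, 0.9)}
for ef, mkbf in results.items():
    print(f"EF = {ef:.1f}: projected MKBF = {mkbf:.0f} km")

# Relative increase in projected MKBF between the two extreme EF values
relative_change = (results[0.9] - results[0.6]) / results[0.6]
print(f"increase from EF 0.6 to EF 0.9: {relative_change:.1%}")
```

Under this uniform-EF assumption the change between the two extremes is of the same order as the 10 percent quoted above, although the exact figure depends on how the EF values are assigned to the individual modes.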
Projection Summary
  β:      0.8973
  PMTBF:  242.56
  λ:      0.00813
  DMTBF:  200

Statistical Results
  Test                     Result   Test Value   Upper
  Cramer-Von Mises (BD)    Passed   0.0919       0.155

Table 22. RGA 6 PRO projection summary and Cramer-Von Mises test results for Phase 1


The results generated by RGA 6 PRO in Table 22 are similar to the hand-calculated values. The Cramer-Von Mises statistic of 0.0919 is below the critical value of 0.155 for a significance level of 0.1. Hence the hypothesis that the AMSAA model is appropriate is accepted. From Table 22, it can be seen that the $\beta$ value of subsystem B is higher than that of subsystem A, which implies a lower growth rate. This is expected because subsystem B is an OTS system.
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Instantaneous  MKBF  vs  Time 
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Figure  16.  MKBF  projection  for  Phase  1 


Figure 16 shows the plot of reliability versus time for subsystem B during the test. The reliability of subsystem B is constant ($\beta = 1$) during the test because no fixes were implemented on the system and therefore no growth takes place during the test. There is a jump in reliability at the end of the test due to fixes being incorporated into the system. The projection model estimates that the MKBF jumps to 242 kilometers at the end of Phase 1 due to four distinct corrective actions in redesign, quality process and workmanship improvement. This projected MKBF of 242 kilometers exceeds the planned MKBF of 229 kilometers, which indicates that reliability growth is progressing satisfactorily at the end of Phase 1.
The failure trend in Figure 17 shows that the majority of the failures surfaced during the early stages of testing, which is typical of a new system during its initial run-in. These are infant mortality failures due to poor quality and workmanship of components.


Cumulative  number  of  failures  vs  Time 
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Figure  17.  Cumulative  number  of  failures  vs  time  plot  for  Phase  1 


In addition, the fraction of seen and unseen Type BD failure intensity can also be estimated. The intensity for Type BD failure modes that have been seen in the testing can be estimated as follows:

$$\lambda_{BD} - h(1000/BD) = 0.005 - 0.00358 = 0.00142$$

The fraction of Type BD failure intensity due to failure modes that have been seen in test is:

$$\text{Fraction Seen} = \frac{\lambda_{BD} - h(1000/BD)}{\lambda_{BD}} = 0.284$$

$$\text{Fraction Unseen} = 0.716$$
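The split between seen and unseen intensity is simple arithmetic, shown here as a small sketch. The helper name `seen_unseen_fractions` is illustrative, and the inputs are the rounded Phase 1 values from the hand calculation.

```python
def seen_unseen_fractions(lam_bd, h_unseen):
    """Split the Type BD failure intensity into seen and unseen fractions."""
    seen = lam_bd - h_unseen
    fraction_seen = seen / lam_bd
    return fraction_seen, 1 - fraction_seen

# Phase 1 values: lam_BD = 5/1000 and h(1000/BD) = 0.00358
fraction_seen, fraction_unseen = seen_unseen_fractions(5 / 1000, 0.00358)
print(f"seen = {fraction_seen:.3f}, unseen = {fraction_unseen:.3f}")
```

A large unseen fraction indicates that most of the remaining Type BD intensity comes from modes that have not yet surfaced in test.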
Figure 18 displays the failure rate of each individual Type BD failure mode before and after implementing the fixes. It gives clear visibility of the contribution of each individual Type BD failure mode to the system's overall failure rate. The failure management strategy should focus on fixing mode BD3, as it appears to have the highest failure rate in Figure 18.


System's  Failure  Rate  Breakdown 
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Figure  18.  Before  and  after  failure  rate  for  Type  BD  failure  mode  in  Phase  1 


2.  Phase  2  results  and  analysis 


The testing approach used in Phase 2 is Test-Analyze-And-Fix (TAAF), in which fixes for all failure modes discovered are incorporated during the test. The total cumulative test mileage for this phase is 1600 kilometers. The AMSAA Extended Test-Fix-Test Model is used to analyze the reliability for this phase. All failure modes are classified as Type BC in Table 23.


j    Time to Event, Xj    Classification    Mode    Failure Category
1    89                   BC                1       Design
2    147                  BC                2       Faulty Component
3    356                  BC                3       Design
4    626.84               BC                4       Quality
5    719                  BC                3       Quality
6    1285.4               BC                5       Design
7    1420                 BC                6       Design

Table 23. Test-fix-test data for Phase 2

BC Mode    Number of Failures, Nj    Time to First Occurrence
1          1                         89
2          1                         147
3          2                         434.6
4          1                         626.84
5          1                         1285.4
6          1                         1420

Table 24. Unique first time occurrence BC failure mode for Phase 2


During this phase, seven failures from six unique failure modes were observed in 1600 kilometers of testing. With the above test data, the demonstrated MKBF for this test phase can be calculated.

The shape parameter is estimated using equation 4.5.

$$\beta = \frac{N}{N\ln T - \sum_{i=1}^{N}\ln X_i} = \frac{7}{[7\ln 1600 - (\ln 89 + \ln 147 + \ln 356 + \ln 626.84 + \ln 719 + \ln 1285.4 + \ln 1420)]} = 0.7905$$

The calculated $\beta$ of 0.7905 ($\beta < 1$) is lower than in Phase 1, which implies a reliability improvement compared to the last phase.

The relationship between the growth rate and the shape parameter is given by equation 4.2.

$$\alpha_{Duane} = 1 - \beta_{AMSAA} = 1 - 0.7905 = 0.2095$$

The scale parameter is estimated using equation 4.6.

$$\lambda = \frac{N}{T^{\beta}} = \frac{7}{1600^{0.7905}} = 0.0205$$

The achieved failure intensity at the end of the test is estimated by equation 4.3.

$$\lambda_{CA} = r(T) = \lambda\beta T^{\beta-1} = 0.0205 \times 0.7905 \times 1600^{0.7905-1} = 0.00345$$

The demonstrated instantaneous MKBF at the end of Phase 2 after 1600 kilometers of testing is given in equation 4.4 as the reciprocal of the intensity function:

$$M_{CA} = \frac{1}{\lambda_{CA}} = 289 \text{ kilometers}$$


For  a  two-sided  confidence  level  of  90  %,  the  demonstrated  MKBF  is  between 
119  and  953  kilometers. 
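The point estimates for this phase can be reproduced from the seven failure times listed in Table 23. This is a minimal sketch assuming the time-terminated forms of equations 4.3 to 4.6 as they are applied in the hand calculation; the function name `demonstrated_mkbf` is illustrative, and the confidence bounds are not reproduced here.

```python
import math

def demonstrated_mkbf(failure_times, total_time):
    """Crow-AMSAA (NHPP) estimates for a time-terminated test-fix-test phase."""
    n = len(failure_times)
    beta = n / (n * math.log(total_time)
                - sum(math.log(x) for x in failure_times))
    lam = n / total_time ** beta
    intensity = lam * beta * total_time ** (beta - 1)   # achieved intensity r(T)
    return beta, lam, intensity, 1 / intensity

phase2_times = [89, 147, 356, 626.84, 719, 1285.4, 1420]   # Table 23
beta, lam, lam_ca, mkbf = demonstrated_mkbf(phase2_times, 1600)
print(f"beta = {beta:.4f}, growth rate = {1 - beta:.4f}")
print(f"lambda = {lam:.4f}, achieved intensity = {lam_ca:.5f}, "
      f"MKBF = {mkbf:.1f} km")
```

The same function can be applied to any later test-fix-test phase by passing that phase's failure times and termination time.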

Another useful metric is the initial system instantaneous MKBF at the beginning of this phase. It is given in equation 4.21.

$$M_{I} = \frac{\Gamma\left(1 + \frac{1}{\beta}\right)}{\lambda^{1/\beta}} = 156 \text{ kilometers} \qquad (4.22)$$


The initial MKBF of 156 kilometers at the beginning of Phase 2 is within the confidence interval of 110 to 534 kilometers at the end of Phase 1. It is estimated that the initial system MKBF at the beginning of the test was 156 kilometers and that, due to six distinct fixes, the reliability grew to 289 kilometers at the end of 1600 kilometers of testing.
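The initial MKBF can be evaluated directly from the shape and scale estimates. The sketch below assumes that equation 4.21 takes the gamma-function form M_I = Γ(1 + 1/β) / λ^(1/β), which reproduces the 156 kilometers quoted above; the function name `initial_mkbf` is illustrative.

```python
import math

def initial_mkbf(beta, lam):
    """Initial MKBF of a test phase: Gamma(1 + 1/beta) / lam**(1/beta)."""
    return math.gamma(1 + 1 / beta) / lam ** (1 / beta)

m_initial = initial_mkbf(beta=0.7905, lam=0.0205)   # Phase 2 estimates
print(f"initial MKBF = {m_initial:.0f} km")
```

Comparing this value with the demonstrated MKBF at the end of the phase gives a direct measure of how much growth was achieved within the phase.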
Analysis Summary
  Model:              Crow-AMSAA (NHPP)
  Analysis Method:    MLE
  β:                  0.7905
  Test Procedure:     Developmental
  λ:                  0.0205
  Input Type:         Cumulative
  Growth Rate:        0.2095
  Termination Time:   1600
  Instant. MTBF:      289.13

Statistical Results
  Test                Result   Test Value   Upper
  Cramer-Von Mises    Passed   0.0275       0.165

Table 25. RGA 6 PRO analysis summary and Cramer-Von Mises test results for Phase 2


The results generated by RGA 6 PRO are presented in Table 25. The Cramer-Von Mises statistic of 0.0275 is below the critical value of 0.165 for a significance level of 0.1. Hence the hypothesis that the AMSAA model is appropriate is accepted.
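The goodness-of-fit statistic itself can also be computed by hand. The sketch below assumes the form of the Cramer-Von Mises statistic commonly given for a time-terminated Crow-AMSAA fit, in which the shape estimate is first multiplied by (M - 1)/M to remove bias; this formula is not stated in the text above and is an assumption here, as is the function name `cramer_von_mises`. The critical value is taken from Table 25 rather than computed.

```python
import math

def cramer_von_mises(failure_times, total_time):
    """Cramer-Von Mises statistic for a time-terminated Crow-AMSAA fit."""
    m = len(failure_times)
    beta_mle = m / (m * math.log(total_time)
                    - sum(math.log(x) for x in failure_times))
    beta_unbiased = (m - 1) / m * beta_mle
    times = sorted(failure_times)
    return 1 / (12 * m) + sum(
        ((x / total_time) ** beta_unbiased - (2 * i - 1) / (2 * m)) ** 2
        for i, x in enumerate(times, start=1)
    )

phase2_times = [89, 147, 356, 626.84, 719, 1285.4, 1420]   # Table 23
statistic = cramer_von_mises(phase2_times, 1600)
critical_value = 0.165   # quoted in Table 25 for a significance level of 0.1
print(f"statistic = {statistic:.4f}, passed = {statistic < critical_value}")
```

The model is retained when the statistic falls below the critical value, which is the comparison reported in the Statistical Results rows of the tables.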
Figure 19. Failure intensity vs time plot for Phase 2

Figure 20. MKBF vs time plot for Phase 2


According to the idealized growth curve, the expected MKBF at the end of Phase 2 should be approaching 296 kilometers. The instantaneous failure intensity versus time plot in Figure 19 shows that the decrease in failure intensity is not significant throughout the test. The main reason is the high frequency of occurrence of failure mode BC3, as displayed in Figure 21. At this point the program manager must focus on correcting this failure mode.
Figure 21. Failure rate of individual BC failure modes at the end of test


Lastly, it may also be of interest from an engineering perspective to estimate the level of effectiveness of the fixes for the six Type BC failure modes. The average effectiveness factor can be calculated as follows [Ref 20]:

$$\bar{d}_{BC} = \frac{\lambda_{I} - \lambda_{CA}}{\lambda_{I(BC)} - h(T/BC)} \qquad (4.23)$$


The initial system failure intensity is the inverse of the initial system MKBF in equation 4.22.

$$\lambda_{I} = \frac{1}{M_{I}} = \frac{1}{156} = 0.00641$$

Since there are no Type A failure modes in the data, the initial failure intensity for Type BC failure modes is equal to the initial failure intensity of the system.

$\lambda_{CA}$ is the achieved system failure intensity at the end of the test phase, which has already been determined. The next step is to determine the failure intensity for new Type BC failure modes at the end of this test phase. This is the same as equation 4.14 but considers only BC modes.

$$h(T/BC) = \lambda\beta T^{\beta-1}$$

The estimates of $\lambda$ and $\beta$ are calculated using equations 4.5 and 4.6 based on the first-time occurrence data for BC failure modes in Table 24.


$$\beta = \frac{N}{N\ln T - \sum_{i=1}^{N}\ln X_i} = \frac{6}{[6\ln 1600 - (\ln 89 + \ln 147 + \ln 434.6 + \ln 626.84 + \ln 1285.4 + \ln 1420)]} = 0.763$$

$$\lambda = \frac{N}{T^{\beta}} = \frac{6}{1600^{0.763}} = 0.0214$$

$$h(T/BC) = \lambda\beta T^{\beta-1} = 0.00304$$

Finally, the average EF for Type BC failure modes can be determined

$$\bar{d}_{BC} = \frac{\lambda_{I} - \lambda_{CA}}{\lambda_{I(BC)} - h(T/BC)} = 0.878 \qquad (4.24)$$


In conclusion, the six corrective actions remove an average of 87.8 % of the failure rate from the six unique failure modes. An average of 12.2 % remains in the six BC modes. The average EF of 0.878 is high, which implies that the corrective actions that have been incorporated are very effective.
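The average EF follows from three intensities that have already been calculated. The sketch below uses the rounded values quoted above and assumes, as stated in the text, that there are no Type A modes, so the initial BC intensity equals the initial system intensity; the function name `average_bc_effectiveness` is illustrative.

```python
def average_bc_effectiveness(lam_initial, lam_achieved, h_new_bc,
                             lam_initial_bc=None):
    """Average EF of the Type BC fixes incorporated during a test phase."""
    if lam_initial_bc is None:
        lam_initial_bc = lam_initial   # no Type A modes in the data
    return (lam_initial - lam_achieved) / (lam_initial_bc - h_new_bc)

d_bc = average_bc_effectiveness(lam_initial=1 / 156,
                                lam_achieved=0.00345,
                                h_new_bc=0.00304)
print(f"average EF for BC modes = {d_bc:.3f}")
```

The remaining fraction, 1 minus the average EF, is the share of the original failure rate still left in the fixed modes.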
3. Phase 3 results and analysis


The TAAF approach should be applied in the last test phase so that the effectiveness of all corrective actions implemented during the test can be verified by the end of the test. Hence a more accurate assessment of the system reliability based on the current configuration can be obtained. In Phase 3, all fixes are incorporated during the test. The system was planned to be tested for 2250 kilometers in this phase. During this phase the MKBF of the system was continuously assessed to provide constant technical and management visibility on the effectiveness of corrective actions and program status.

Testing was terminated early at 2200 kilometers instead of the planned 2250 kilometers because the system reliability had exceeded the requirement. The data collected during the test is presented in Table 26.


j    Time to Event, Xj    Classification    Mode    Failure Category
1    36                   BC                1       Design
2    334                  BC                2       Quality
3    823.6                BC                3       Faulty component
4    958                  BC                4       Workmanship
5    960                  BC                5       Faulty component
6    1433                 BC                6       Quality
7    1741                 BC                7       Workmanship

Table 26. Test-fix-test data for Phase 3


The  AMSAA  Extended  Test-Fix-Test  Model  is  used  to  analyze  the  test  data  in 
Table  26. 


The shape parameter is estimated using equation 4.5

$$\beta = \frac{N}{N\ln T - \sum_{i=1}^{N}\ln X_i} = \frac{7}{[7\ln 2200 - (\ln 36 + \ln 334 + \ln 823.6 + \ln 958 + \ln 960 + \ln 1433 + \ln 1741)]} = 0.7524$$

The scale parameter is estimated using equation 4.6

$$\lambda = \frac{N}{T^{\beta}} = \frac{7}{2200^{0.7524}} = 0.0214$$


The relationship between the growth rate and the shape parameter is given by equation 4.2.

$$\alpha_{Duane} = 1 - \beta_{AMSAA} = 0.25$$

The growth rate of 0.25 indicates a significant improvement compared to Phase 2. The achieved failure intensity is estimated as:

$$\lambda_{CA} = r(T) = \lambda\beta T^{\beta-1} = 0.0214 \times 0.7524 \times 2200^{0.7524-1} = 0.00239$$


The demonstrated instantaneous MKBF at the end of the test phase after 2200 kilometers of testing is given in equation 4.4 as the reciprocal of the intensity function:

$$M_{CA} = \frac{1}{\lambda_{CA}} = 417 \text{ kilometers}$$

For a two-sided confidence level of 90 %, the demonstrated MKBF falls between 172 and 1377 kilometers.

Another useful metric is the initial system MKBF at the beginning of this phase. It is given by equation 4.21.

$$M_{I} = \frac{\Gamma\left(1 + \frac{1}{\beta}\right)}{\lambda^{1/\beta}} = 197 \text{ kilometers}$$


The initial MKBF of 197 kilometers at the beginning of Phase 3 is within the confidence interval of 103 to 829 kilometers at the end of Phase 2. It is estimated that the initial system MKBF at the beginning of the test was 197 kilometers and that, due to seven distinct fixes, the reliability grew to 417 kilometers at the end of 2200 kilometers of testing.
Analysis Summary
  Model:              Crow-AMSAA (NHPP)
  Analysis Method:    MLE
  β:                  0.7524
  Test Procedure:     Developmental
  λ:                  0.0214
  Input Type:         Cumulative
  Growth Rate:        0.2476
  Termination Time:   2200
  Instant. MTBF:      417.71

Statistical Results
  Test                Result   Test Value   Upper
  Cramer-Von Mises    Passed   0.0647       0.165

Table 27. RGA 6 PRO analysis summary and Cramer-Von Mises test results for Phase 3


The summary and statistical results generated by RGA 6 PRO are presented in Table 27. The Cramer-Von Mises statistic of 0.0647 is below the critical value of 0.165 for a significance level of 0.1. Hence the hypothesis that the AMSAA model is applicable is accepted.

The AMSAA Test-Fix-Test Model estimates that the demonstrated MKBF of 417 kilometers at the end of 2200 kilometers of testing has exceeded the requirement of 350 kilometers. The test was terminated at a cumulative mileage of 2200 kilometers primarily for two reasons: 1) the cumulative number of failures versus time plot in Figure 22 shows a decreasing trend in the number of observed failures towards the end of the test, and 2) the MKBF requirement has already been exceeded. Only two failures were observed in the last 1100 kilometers of testing. The achieved growth rate of 0.25 is also a significant improvement compared to Phase 2. The effectiveness of the fixes in previous phases has been validated, which is an obvious reason for the improved reliability. Also, the $\beta$ value of subsystem B is higher than that of subsystem A in all three phases of testing, which implies a lower growth rate. This is expected because subsystem B is an OTS system.

In conclusion, the implementation of the TAAF approach for subsystem B has been successful in surfacing and fixing potential failure modes and thus exceeding the reliability target. The system grows from an initial demonstrated MKBF of 200 kilometers to 417 kilometers. The RGT program has identified, reviewed and fixed a total of four unique design failure modes and six quality process and control failure modes during the three phases of testing, leading to positive reliability growth.


Figure 22. Cumulative number of failures vs time plot for Phase 3

Figure 23. Instantaneous failure intensity vs time plot for Phase 3

Figure 24. MKBF vs time plot for Phase 3




V.  CONCLUSIONS 


The  key  findings  of  the  literature  research  suggest  that  the  disparity  between 
predicted  and  field  reliability  stems  from  some  inherent  assumptions  of  the  MIL-HDBK- 
217  prediction  model.  First,  the  constant  failure  rate  assumption  that  has  been  generally 
applied  in  reliability  prediction  is  not  always  applicable.  However,  Drenick’s  theorem  has 
proven  that  complex  repairable  systems,  under  certain  constraints,  can  be  well 
represented  by  the  exponential  distribution.  The  reliability  engineer  must  be  able  to 
recognize  when  the  mathematical  simplicity  of  the  constant  failure  rate  model  can  be 
used  without  a  substantial  penalty  in  prediction  accuracy.  Secondly,  even  if  the 
exponential  distribution  is  applicable,  the  disparity  between  predicted  and  field  reliability 
may  still  exist  in  new  systems  because  of  unexpected  failure  modes  that  may  arise  in  the 
presence  of  design  and  quality  deficiencies  which  will  prevent  the  system  from  reaching 
the  predicted  value.  One  possible  solution  to  reduce  the  frequency  of  occurrence  of 
unexpected  failures  in  the  field  and  for  the  system’s  reliability  to  approach  the  predicted 
value  is  to  apply  RGT  during  the  development  phase. 

The  results  of  the  reliability  analysis  for  the  combat  system  show  that  the 
demonstrated  system  reliability  for  both  subsystems  is  initially  low.  For  subsystem  A  the 
initial  MRBF  is  only  45  %  of  the  final  achieved  MRBF.  For  subsystem  B  the  initial 
MKBF  is  only  48  %  of  the  final  achieved  MKBF.  However,  the  reliability  for  both 
subsystems  improves  as  testing  progresses.  Reliability  is  finally  estimated  to  meet  the 
predicted  value  as  failure  modes  are  discovered  and  eliminated  through  the  TAAF 
process.  I  conclude  that  the  application  of  RGT  during  the  developmental  phase  is 
effective  in  minimizing  the  disparity  between  predicted  and  field  reliability.  Systems  that 
bypass  development  testing  will  experience  low  reliability  in  the  field,  which  is  one  of 
the  main  causes  of  disparity  between  predicted  and  field  reliability. 

This  thesis  has  successfully  demonstrated  the  detailed  application  of  the  Duane 
Model  and  the  AMSAA  Extended  Models  for  the  reliability  planning  and  analysis  of  a 
combat  system.  Some  of  the  important  lessons  learned  on  the  use  of  the  reliability  growth 
models  are  summarized  below. 
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In  reliability  growth  planning,  the  total  test  time  required  for  the  RGT  program  is 
sensitive  to  the  following  parameters:  1)  system’s  initial  reliability,  2)  initial  test  time,  3) 
growth  rate.  In  most  practical  cases,  the  total  test  time  is  usually  fixed  due  to  time  and 
resources  available  in  the  development  program.  The  most  accurate  way  of  determining 
the  initial  system’s  reliability  is  to  subject  the  system  to  pre-development  testing  and  the 
initial  test  time  must  be  long  enough  for  at  least  the  first  failure  to  surface. 

The use of failure mode designations such as Type BD and BC associated with the AMSAA Extended Growth Models has provided better visibility than the AMSAA Basic Test-Fix-Test Model. It allows generation of many useful metrics such as: 1) initial system reliability at the beginning of the test, 2) the average effectiveness factor of remedying failure modes, 3) fraction of seen and unseen Type BD failure modes, and 4) system failure rate breakdown for individual failure modes. Knowing the failure rate breakdown of individual failure modes in the system is important as it enables easy identification of failure modes with relatively high failure rates. The reliability of subsystems A and B continues to grow during the RGT because of focused efforts in eliminating these major contributors. In addition, the ability of the AMSAA Extended Test-Find-Test Model to estimate the increase in the system's reliability for Type BD failure modes has allowed for a more in-depth analysis of the test data. This is especially useful at the end of the RGT program when the demonstrated reliability of the system is below the target and, due to resource constraints, further testing is not possible. It is therefore important to know if the final system reliability can meet the requirements after incorporating the delayed fixes. For subsystem A, the Extended Test-Find-Test Model estimates an increase in MRBF from 186 rounds to 206 rounds due to three distinct delayed fixes with an assumed EF of 0.6, thus exceeding the MRBF target of 200 rounds. It is also important to note that the final system reliability is sensitive to the assigned value of EF. To prevent overestimation of the system's final reliability, a conservative EF should be assigned since the actual effectiveness of the delayed fixes cannot be determined without further testing.

For new systems under development, the use of the AMSAA NHPP model provides a better representation of the system's failure rate than the exponential distribution because the failure rate is varying with time as testing progresses. Once the system matures through a period of testing and reliability growth has reached a plateau, the system's failure rate will tend towards being well represented by an exponential distribution.

The use of the various reliability charts, such as cumulative failures versus time, failure intensity versus time, and MRBF/MKBF versus time, has also provided management and technical visibility on how the system is performing during the test, which is also a major factor contributing to the success of the RGT program.
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